Each year a large number of students who lack suitable
characteristics, either in capacity or in training or in both, 
enroll as freshmen in schools of engineering. There seem to 
be certain qualifications requisite for those who are to follow 
the profession of engineering successfully, but these have not 
yet been determined clearly. In order that vocational and 
educational guidance may be made more scientific, these suita- 
ble characteristics should be determined and made measurable 
by means of differentiating tests. 

Oregon State College registration records show that 38.1 
per cent of the students who enrolled as freshmen in the 
School of Engineering did not re-enroll as sophomores in that 
school the following year. A tabulation of similar data shows¹
that of 5,338 freshmen in engineering in 25 large schools of
engineering in the United States, 39.1 per cent did not
re-enroll as sophomores. These
records show¹ also that only 28.1 per cent of the students who
enroll as freshmen complete the standard courses in engineering.
While the time, effort, and money of those who do not
continue have not been entirely lost during the time they were
in the schools of engineering, they have not been used as 
valuably as they might have been. As a result of the emo- 
tional shock of failure or unexpected difficulty, undue nega- 
tivism and attitudes of inferiority are often established in 
those students who drop out. The financial losses to the
schools of engineering in the partial training of these students
are considerable also. Of the total eliminations of students
who drop out, more than 50 per cent take place during the
freshman year.

¹ A Study of the Admissions and Eliminations of Engineering Students. Bulletin 2 of the Investigation of Engineering Education, 1926.

The primary problem of this study is that of finding the 
prognostic values of several measuring devices in terms of 
actual achievement in a school of engineering. 

This study brought out the fact that there were two signifi- 
cantly different groups of students who were in difficulty 
scholastically and who could be studied with some degree of 
accuracy and thoroughness. These students may be segre- 
gated by selecting those who made low scores in the college 
entrance intelligence examination and low college grades as 
one group, and those students who made high scores in the 
college entrance intelligence examination and low college 
grades as a second group. The students who fall into these 
two groups cause approximately 95 per cent of the difficulty 
in student guidance and counseling. The most serious prob- 
lem as far as loss of able student material is concerned is with 
the students of the second group. One would consider any 
student who made a high score in the college entrance ex- 
amination as a good risk in a school of engineering, that is, 
his chance of failure in his studies is small compared with 
those of the student who made a low score on the entrance 
examination. While some students in this high ability group 
do fail for various reasons, not as many fail as in the low 
ability group. It may be that they have been forced into the
study of engineering by parental authority, or have entered
it because of the retention of an adolescent glamor, or remain
in it because of a false pride of choice. It may be lack of
industry, lack of initiative, or lack of incidental practice and
application of principles in practice that seems to be the
immediate cause of their failure. There are other more
individual causes.

In the year 1929-1930, 55 per cent of the freshman engineers
who were in the low quartile of the college entrance examina- 
tion scores failed to register for the third term of the college 
year. These students were graduated from accredited high
schools where at least some kind of educational and vocational 
guidance was available. Of those who enrolled in spite of 
advice to the contrary, all claimed an intense interest or at 
least what they believed to be interest in the subject. They 
believed that interest and industry would carry them through 
in spite of lack of ability. Interest and industry do not seem to do so.

Two other groups of students with whom these first two 
should be compared are, first, the students who made average 
scores on the entrance examination and average grades in 
engineering subjects, and, second, those who made high scores 
in the entrance examination and high college grades. These 
students are succeeding in their chosen studies; and, if they 
are placed in the wrong profession, there are no indications 
of this fact. It is entirely possible that they might do better 
work in other fields of endeavor, but no positive proof is avail- 
able since they seldom change their courses. These may be 
assumed to act as control groups who are doing satisfactory 
work in the school of engineering. 

For this study the students were divided into four groups. 
These are: 

Group A—those who made high college grade averages for 
two quarters of their freshman year and high scores in the 
college entrance intelligence examination. 

Group B—those who made low college grade averages for 
two quarters and low scores on the entrance test. 

Group C—those who made low college grade averages for 
two quarters and high scores on the entrance test. 

Group D—those who made median college grade averages 
and average scores on the college entrance intelligence ex- 
amination. 
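The four-way grouping above can be written as a small classification rule. The Python sketch below is illustrative only: the study does not state its cut-off values, so the percentile thresholds `low_cut` and `high_cut` and the names `band` and `assign_group` are assumptions introduced here.

```python
from typing import Optional


def band(percentile: float, low_cut: float = 25.0, high_cut: float = 75.0) -> str:
    """Label a percentile rank as 'low', 'mid', or 'high' (cut-offs are assumed)."""
    if percentile <= low_cut:
        return "low"
    if percentile >= high_cut:
        return "high"
    return "mid"


def assign_group(test_pct: float, grade_pct: float) -> Optional[str]:
    """Return the study group (A to D) for a student, or None if no group applies."""
    groups = {
        ("high", "high"): "A",  # high entrance score, high college grades
        ("low", "low"): "B",    # low entrance score, low college grades
        ("high", "low"): "C",   # high entrance score, low college grades
        ("mid", "mid"): "D",    # average entrance score, median college grades
    }
    return groups.get((band(test_pct), band(grade_pct)))
```

Students whose scores and grades fall in any other combination (for example, a low entrance score with high grades) are left unassigned, since the study defines no group for them.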

KIND OF TESTS USED 

The following tests were used: 

1. The American Council on Education Psychological Ex- 
amination for High School Seniors and College Freshmen, by 
Thurstone and Thurstone. 

2. Iowa Silent Reading Test, Advanced Examination, by
H. A. Greene and A. N. Jorgensen. 

3. Thorndike Test of Word Knowledge, by E. L. Thorndike. 

4. Ascendance-Submission Reaction Study, by G. W. and 
F. H. Allport. 

5. McQuarrie Test for Mechanical Ability, by L. W. McQuarrie.

6. Stenquist Mechanical Aptitude Test No. 1, by J. L. 
Stenquist. 

7. Stenquist Mechanical Aptitude Test No. 2, by J. L. 
Stenquist. 

8. Stenquist Assembly Test Series No. 1, by J. L. Stenquist. 

9. Strong Vocational Interest Analysis Blank, by E. K. 
Strong, Jr. 


The analysis of the data will consist of intercorrelations 
among the scores. The correlation between the psychological 
test scores of the entire group studied and their college grades 
was +0.555. The relatively low correlation is due, among
other things, to the limited number of cases and to the
inclusion of “Group C” in the computations. If “Group C,”
those with the high psychological test scores and low grades,
had been left out, a much higher correlation would have
resulted. Although this group can hardly, at the present time,
be identified as unsatisfactory students of engineering before 
they have entered the school of engineering, they should be 
identified by the end of their first year. 
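The intercorrelations referred to here are ordinary product-moment coefficients. As a minimal sketch, and not the computing procedure actually used in the study, the following Python shows how such a table of coefficients could be obtained; the function names and the dictionary layout of the scores are assumptions made for illustration.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 cases")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def intercorrelations(scores: Dict[str, List[float]]) -> Dict[Tuple[str, str], float]:
    """Correlate every pair of measures; keys are (measure_a, measure_b)."""
    names = list(scores)
    return {
        (a, b): pearson_r(scores[a], scores[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

Calling `intercorrelations` with one list per test (and one for college grades) returns a coefficient for each pair of measures, which is the form in which the results below are reported.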

The correlation of the scores on the Iowa Silent Reading
Test with college grades for the total group is +0.394. It is
evident that the Iowa Silent Reading Test and Psychological
Examination measure closely similar qualities since the
correlation of the two is +0.870.

The data show that reading ability is at least an asset to 
good grades but not a positive indicator. It is doubtful, how- 
ever, that a student can make a success of his college work in 
engineering if he can not read above the 30th percentile, at 
least, of the high school norms as given for this test. No 
student in “Group A” is below the 50th percentile.













ENGINEERING APTITUDE 111 


The correlations of the scores on the Thorndike Word
Knowledge Test with those from the Psychological Examination,
the college grades, and the Iowa Silent Reading Test are
similar; that with the psychological examination, +0.772,
is the highest. Apparently, it is unreasonable to attempt
to teach the “Group B” students engineering, as they have a
very mediocre comprehension of the ideas which their school
work requires that they understand.

The Strong Vocational Interest Analysis Blank does not 
show a high correlation with the other tests. College grades 
correlate +0.322 with it. The one man in “Group A” who
scored a minus on this test has a marked special interest in
physics and plans to do research work in this subject upon
graduation. 

The A-S Reaction Study Blank gave very low correlations 
with all of the other tests. The correlations were for the most 
part negative, the largest negative correlation being with
college grades, -0.404. This apparently indicates greater
submissiveness on the part of the best students of engineering.
The trend of the poorer students is toward the ascendance 
side, especially those who made high scores in the entrance 
psychological examination. This test, although it gives no 
very high correlations with the other tests, does indicate stu- 
dent attitude very well as compared with instructor opinion. 
Perhaps its greatest value is that it fosters self-criticism on 
the part of the student himself. 

The McQuarrie Mechanical Aptitude Test gives a fair
correlation with the Iowa Reading Test, +0.478. The wide
scattering of the scores on the McQuarrie test indicates that
reliance can not be placed in the test as an aid in placing 
students accurately in engineering courses. The test does, 
however, imply that a student who receives a score of less 
than 60 will probably do unsatisfactory college work. Only 
one student who made a score of less than 70 is doing satis- 
factory work in Oregon State College. 

The Stenquist Mechanical Aptitude Test No. 1 gives a low
correlation with all of the other tests and with the college
grade average, with the exception of the other Stenquist
tests. All of the students made approximately the same scores
regardless of their status as students. The students who made 
the very low scores were unfamiliar with the objects pictured 
in the test. They could not identify from the pictures such 
simple objects as a meat chopper, an adz, and a hatchet handle, 
yet they were familiar with the parts of a forge or a gas
engine. The question arises whether they should not have
known about these objects through incidental observation.

The Stenquist Mechanical Aptitude Test No. 2 gave con- 
siderably higher correlations with the other tests than did 
either of the other mechanical aptitude tests. The group 
scores are relatively close together but “Group A” is on the
average above the other groups. Although the correlation
with college grades is not high, +0.428, the test does seem to
measure engineering aptitude to a certain extent. Exercise 
3 of the test requires a definite ability to do some abstract 
thinking and a power to visualize what will happen when a 
load is placed on a structure. The students are called upon 
to work problems of this kind in their everyday college work 
and one would expect the better students to make the better 
grades. The students of the poorer groups nearly all worked 
the first two exercises but failed on definite parts of the third. 

The Stenquist Assembly Test, Series 1, gave a wider range 
of scores than the other Stenquist tests and much lower corre- 
lations. No reason is apparent for some of the lower group 
students being as low as they were on this test unless it is lack 
of observation of objects familiar in everyday life. No student 
who ranked as average in the psychological examination
received as low a score as three of the students in “Group D,”
the lowest group. However, the correlation of the psychologi- 
cal examination with this test is practically zero. Correla- 
tions simply indicate a possibility of predicting scores and the 
low correlations in the case of the mechanical tests do not 
mean that one could not arrive at a lower limit score beyond
which an engineering student would probably fail in his work. 


SUMMARY 


The psychological examination is apparently the best device 
for segregating students into ability groups in engineering at 
the present time. Students who fall below the 20th percentile 
of the college group have very little chance, if any, of ever 
graduating in engineering. 

The Iowa Reading Test measures much the same abilities 
as the psychological examination. Students who score below 
the 30th percentile as determined for high school norms in 
silent reading will probably not succeed in engineering. 

The Thorndike Word Knowledge Test did not give any con- 
clusive results as far as this study is concerned. 

Students who have a tendency toward ascendance as mea- 
sured by the A-S Reaction Study Blank should be watched 
closely during their first college year as they have a decided 
trend toward low grades in engineering. Use of the A-S Re- 
action Study Blank is beneficial in guidance in that it causes 
the students to analyze themselves and to think more clearly 
of their personality characteristics. 

The Strong Vocational Interest Analysis Blank will indi- 
cate extreme cases of lack of interest in engineering. It will 
also locate the border-line cases which the adviser of students
should watch, since some of the border-line cases are students 
who wish to specialize in a particular field of engineering and 
their interests in the broad field as measured by the blank are 
low. The blank can be used only as an indicator of interest 
and not as predictive of college engineering success. 

The Stenquist Mechanical Aptitude Test No. 1 and the Stenquist
Assembly Test, Series 1, gave no indication of predicting
academic success in engineering. The Stenquist Test No. 2,
if used carefully and with the results of the psychological
examination, gives a fair clue to correct educational guidance.
The McQuarrie Test for Mechanical Aptitude could be used
for the same purpose. Either test has too low correlation with
college grades to justify placing much reliance on it alone. 

The value of the scores of any one of the tests, except 
possibly the psychological examination, as a means of pre- 
dicting academic success in engineering is very small. If, 
however, the educational counselor has the scores of a student 
on the psychological examination, the interest analysis blank, 
and in extreme cases, the Stenquist No. 2, he can give much 
more accurate advice in a personal interview with the student 
than he could without them. 











RELIABILITY, VALIDITY AND DEPENDABILITY* 
W. V. BINGHAM


Personnel Research Federation, New York 


Mathematics, as an instrument of industrial psychology
(psychotechnique), is useful only as it serves the specific aims
of this science. Chief of these aims is: the adjustment of the 
individual worker to his occupation and to the conditions of 
his employment in a way which increases both his output 
and his happiness. Whether the industrial psychologist ap- 
proaches this task in order to increase the welfare of the state, 
or the industry, or the group or class, or the employer, he 
knows that his aim can be achieved only as he succeeds in in- 
creasing the well-being of the individual worker. 

Bearing in mind the central importance of the individual, 
this paper emphasizes the need for keeping in mind the limi- 
tations of certainty, in interpreting the significance of single 
measures of an ability. In addition to the familiar mathe- 
matical concepts of reliability and of validity, the industrial 
psychologist should think also in terms of what has been called 
the dependability¹ of the test.

The reliability of a test is defined as its self-correlation. 
This is ordinarily ascertained either (1) by repeating the test 
and correlating the two sets of measurements, or (2) by cor- 
relating data obtained with two similar forms of the test (or 
the odd items and even items, or the first half with the second 
half, and correcting for length of test). The first is the upper 
limit and the second the lower limit of the so-called coefficient 
of reliability. Such coefficients are used to ascertain the ade- 
quacy of sampling. They indicate how closely the measures 
will probably be duplicated on repetition. The reliability of 
a test, in this sense, may be low because of: 


(a) Errors in measurement, due to flaws in the test or in
its administration.

(b) Restriction in range of ability measured.

(c) Variability in the ability measured; that is, change in
the ability through training, or susceptibility of the subjects
to embarrassment, fatigue, incentives to effort, and other
temporary influences.

* Paper prepared for the Seventh International Conference of Psychotechnique held in Moscow, Russia, September, 1931.
¹ H. M. Johnson: Proceedings of the Ninth International Congress of Psychology, New Haven, 1929, p. 240-241.

If a test has a high coefficient of reliability, the inference 
may be drawn that the abilities it measures are relatively 
stable, and that the test as an instrument of measurement is 
free from variable errors. If, however, the reliability is low, 
a remedy is sought in perfecting the instrument of measure- 
ment, or in increasing the length of the test, or the number of 
repetitions of the measure; or, the test is abandoned. 
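The effect of lengthening a test on its reliability can be estimated in advance. The text does not name the correction it has in mind; the sketch below assumes the Spearman-Brown formula, which is the customary one for stepping up a split-half correlation. The function name and the `length_factor` parameter are illustrative.

```python
def spearman_brown(r: float, length_factor: float = 2.0) -> float:
    """Estimated reliability of a test lengthened `length_factor` times.

    `r` is the reliability of the test at its present length. With the
    default factor of 2 this steps up a correlation between two halves
    (odd vs. even items) to an estimate for the full-length test.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * r / (1.0 + (length_factor - 1.0) * r)
```

For example, a correlation of .60 between the odd and the even items corresponds to an estimated full-test reliability of .75, and tripling a test whose reliability is .50 likewise gives .75.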

It is of equal importance to know the reliability of the 
criterion against which a test is validated. The criterion with 
which the psychologist is here concerned is some measure of 
success or performance in the occupation, which the test is 
designed to predict. This criterion, in the case of automobile
drivers, for example, may be the number and severity of
collision accidents per unit of exposure. The available criterion
may be unreliable because the records are incomplete, or
inaccurate, or difficult to express in comparable units. But the
industrial psychologist scrutinizes the reliability of his
criteria quite as critically as he does the reliability of his tests,
and bends every effort to make them accurate and serviceable. 

The validity of a test is defined as the closeness of its corre- 
lation with the chosen criterion. Validity, in this sense, can- 
not in the nature of the case, be any higher than the reliability 
of the test or of its criterion, and may be much lower. For 
this reason, investigators sometimes omit the determination of 
reliability. They proceed at once to correlate test and cri- 
terion, knowing that if the validity proves to be high, the 
reliabilities must of necessity also be high. 

Employment managers or vocational counselors may do an 
individual grave injustice unless they clearly understand the 
limitations of dependability of the tests they use. When is 
the validity of a test high enough to be of practical service in
predicting the criterion in a particular case? 

In answering this question, the need for caution in inter- 
preting coefficients of correlation arises, because of the natural 
tendency to infer that a test with a coefficient of validity of 
.60, is 60 per cent as valid (in the ordinary sense) as a perfect 
test. Actually it is only 20 per cent. What we really want 
to know about a test score is how closely it predicts the criterion.
This is best expressed as $1 - k$, the difference between
unity and the standard error of estimate, Kelley’s “coefficient
of alienation.” Since $k = \sqrt{1 - r^2}$, we may write

$$\text{Dependability} = 1 - \sqrt{1 - r^2}$$


Such a “coefficient of dependability” gives an idea of the
reliability of the unknown variable which is predicted from a 
given value of the known variable. 

Dependability is necessarily small unless the coefficients of 
reliability and validity are high. As the coefficient of validity 
increases in size from zero to 1.00, the dependability also 
increases, but much more slowly until the coefficient of validity 
approaches .70. The relationship between these two measures, 
the coefficient of dependability and the so-called coefficient of 
validity, is the same as that between the sine and the versed 
sine. It is shown graphically in Figure 1 on page 120. 
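The comparison with the sine and the versed sine can be made explicit in a few lines. In this sketch the angle $\theta$ is an auxiliary quantity introduced only for illustration; $r$ is the coefficient of validity, $k$ the coefficient of alienation, and $D$ the coefficient of dependability.

```latex
\begin{aligned}
r &= \sin\theta, \qquad 0 \le \theta \le \tfrac{\pi}{2},\\
k &= \sqrt{1 - r^{2}} = \cos\theta,\\
D &= 1 - k = 1 - \cos\theta = \operatorname{versin}\theta .
\end{aligned}
```

Because $1 - \cos\theta$ grows slowly for small $\theta$, the dependability stays near zero until $r$ is fairly large, and then rises steeply as $r$ approaches 1.00.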

A glance at this figure shows how little dependability can 
be placed in predictions based on a correlation of .36; but 
when adoption of better criteria, better batteries of tests or 
better methods of administration raises the correlation to .70 
or .80, the probability that any individual measure will pre- 
dict the criterion becomes rapidly greater. The table on page 
119 gives a few coefficients of correlation (r) and the corre- 
sponding coefficients of dependability (D). 
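The values of D follow directly from the formula for dependability given earlier, D = 1 - sqrt(1 - r²). The short Python sketch below, with an illustrative function name, shows how any entry can be recomputed.

```python
from math import sqrt


def dependability(r: float) -> float:
    """Coefficient of dependability, 1 - sqrt(1 - r^2), for a correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return 1.0 - sqrt(1.0 - r * r)


if __name__ == "__main__":
    # Print D for a few of the correlations discussed in the text.
    for r in (0.36, 0.60, 0.70, 0.80, 0.90):
        print(f"r = {r:.2f}   D = {dependability(r):.2f}")
```

A validity of .60, for instance, gives a dependability of .20, which is the sense in which such a test is only 20 per cent as dependable as a perfect one.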

   r       D
 1.00    1.00
  .995    .90
  .99     .86
  .98     .80
  .96     .72
  .93     .63
  .90     .56
  .80     .40
  .70     .29
  .60     .20
  .50     .13
  .44     .10
  .37     .07
  .28     .04
  .20     .02
  .14     .01
  .10     .005
  .00     .00

When nothing better is available as a basis on which to
forecast a young man’s future performance, by all means let
us continue to use tests whose correlation with the selected
criterion is only .60, or less. But let us at the same time tell
him that the prediction is not much better than a chance
estimate. If his performance in several tests, each with only
moderate validity, nevertheless points in the same vocational 
direction, our basis for confidence increases. It is further 
strengthened if they harmonize with what we know of his 
interests and ambitions, temperament, emotional stability, 
social and economic background, family traditions, educa- 
tional accomplishments, and previous occupational experi- 
ence. Ideally, we would like to reduce all these considerations 
to quantitative formulations, and assign each its proper weight 
along with the test scores, using when possible the multiple 
regression equation, in order to make the best possible pre- 
diction. Actually, many important considerations are not 
susceptible to mathematical treatment of this kind. Some of 
them, instead, can be conveniently dealt with by the method
of critical scores and group comparison, to which reference
was made in my paper on “Neglected Methods in Employment
Psychology” read at the Paris Conference of Psychotechnology
in 1927.²

² Comptes rendus, p. 228. See also, Bingham and Freyd, Procedures
in Employment Psychology, Chapters XIII, XV, and XVII. 

Figure 1. Unless a coefficient of correlation is high, the likelihood of correct
prediction from a test score is very low.


In using this method, a comparison is made of two groups, 
the successful and the unsuccessful employees; those who 
remain in the occupation, and those who leave it; those who 
learn quickly, and those who take a longer time to acquire the 
necessary vocational skills; those who earn a promotion, and 
those who do not; those who have few accidents, and those who 
have many; those who sell more than their quota, and those 
who sell less; and so on. A test score above which or below 
which the members of one group are found in a conspicuously 
greater proportion than the members of the other group is 
called a critical score. A range marked off by critical scores
is called a preferred range, or critical section. Critical scores
may be ascertained by use of Pearson’s formula for bi-serial 
r; or they may be determined by inspection after the scores 
made by members of the two groups—successes and failures— 
have been entered on one chart, successes being indicated by 
one kind of symbol and failures by another kind. The inves- 
tigator then notes the sections of the distribution in which one 
group is represented in relatively greater proportion than the 
other group. These critical sections are then set off by 
critical scores. Sometimes the preferred range lies between 
two critical scores because workers whose abilities are above 
a certain maximum score soon become dissatisfied and leave 
the occupation before they have become competent. (Here 
the relation between test score and success is non-linear, so 
that the Pearson coefficient of correlation cannot be used.) 
The significance of these critical sections may be ascertained 
by comparing the difference in proportions of the two groups 
found in the critical sections, use being made of the familiar 
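formula for the standard error of the difference in proportions:

$$\sigma_{d} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

where $p_1$ and $p_2$ are the proportions of the two groups found in the critical section, $q = 1 - p$, and $n_1$ and $n_2$ are the numbers of cases in the two groups.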

In making an individual prognosis on the basis of critical 
scores in several tests, each may be assigned a weight in 
accordance with its relative significance; that is, varying with
the size of the ratio of the difference and its standard error.
In other words, the percentages of the different groups in a
critical section are ascertained, and the difference in these
percentages is divided by the standard error of the difference. 
The quotient indicates the weight to be assigned to scores in 
that critical section. If an individual scores within a critical 
section he is credited with the amount of this quotient, in 
place of his raw score. His total score is the sum of such 
values. While this is by no means the only convenient way 
of combining items of information in order to secure the best 
prediction, it is a method which has increasingly commended
itself to investigators. It has this merit, that it is applicable
to data which do not lend themselves to mathematical treat- 
ment by the method of correlation. 
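The weighting procedure just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's own computing routine: the standard error is taken as the usual unpooled form, sqrt(p1·q1/n1 + p2·q2/n2), and all function and parameter names are hypothetical.

```python
from math import sqrt
from typing import Iterable


def se_diff_proportions(p1: float, n1: int, p2: float, n2: int) -> float:
    """Standard error of the difference between two independent proportions."""
    return sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)


def critical_section_weight(in_section_success: int, n_success: int,
                            in_section_failure: int, n_failure: int) -> float:
    """Difference in proportions divided by its standard error.

    The counts are the numbers of successful and unsuccessful workers
    whose scores fall inside the critical section.
    """
    p1 = in_section_success / n_success
    p2 = in_section_failure / n_failure
    se = se_diff_proportions(p1, n_success, p2, n_failure)
    if se == 0.0:
        raise ValueError("standard error is zero; the weight is undefined")
    return (p1 - p2) / se


def total_score(weights_earned: Iterable[float]) -> float:
    """Sum of the weights of the critical sections an individual falls in."""
    return sum(weights_earned)
```

An individual is credited with the weight of each critical section in which his score falls, and `total_score` adds those credits. A section favoring the unsuccessful group yields a negative weight.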

Summary. This paper has emphasized three points. The 
first is, the need for caution in estimating the predictive value 
of individual scores. It is of the greatest importance to the 
individual, that the psychologist should know not only the 
reliability of a test, and its validity as ordinarily measured; 
he should keep in mind also its dependability, whenever use 
is made of individual scores in predicting vocational success. 

Our second point emphasizes the need for giving due
consideration to all the relevant available data about the
individual; for his vocational success and happiness in his career
will depend not only on his abilities, but also on his tempera- 
ment, his physique and health, his interests, his emotional 
stability, his educational, social and economic status, and 
many other factors. 

Some of these factors can be quantified, reliably measured, 
and correlated with the criterion of vocational success in the 
same manner as the scores on a controlled test of performance;
but some of these data cannot be so measured, and correlated 
by means of the familiar Pearson techniques. This leads to 
our third point, namely, that such data should be evaluated 
by the method of group comparisons, and given weights pro- 
portional to the significance of the group differences. Most 
informing and easily understood by the individual who is 
being advised, is a method of assigning a weight to each score 
in accordance with the probability that an individual making 
that score will achieve success. 

When all pertinent data about an individual are secured, 
and given due consideration in the light of their known sig- 
nificance, the psychologist puts himself in a position to render 
the best counsel. The dependability of his service to the in- 
dividual is the measure of his own success as an industrial 
psychologist. 


MEASURING ABILITY TO FAKE OCCUPATIONAL
INTEREST 


HARRY CHARLES STEINMETZ 
San Diego State College, San Diego, California 


The validity of personality and interest tests, as well as of 
tests of general mental ability, is greatly dependent upon 
honesty of report, upon strict adherence by the testee to the 
terms of test administration. This is commonly accepted as 
an important fact, but to the writer’s knowledge there is no 
published experimental evidence of the sort here submitted. 

A manuscript by Dr. E. Lowell Kelly entitled ‘‘ An Experi- 
mental Study of the Ability to Influence Scores on a Typical 
Paper and Pencil Test of Personality Traits’’ was written for 
Dr. Lewis M. Terman and Dr. Catherine Cox Miles, of Stan- 
ford University, during the spring of 1930. The procedure 
followed by Dr. Kelly with the writer’s students, of retesting 
with complete faking requested, is here invoked again. Dr. 
Kelly’s instrument was the Stanford Masculine-Feminine 
Test; the present report is concerned with the Strong Voca- 
tional Interest Blank; the two studies are in substantial agree- 
ment. Dr. E. K. Strong, Jr., has long been concerned with 
the faking of interest, and has studied such ability since 1927. 
One such study is briefly summarized here. 

Between October, 1927, and February, 1928, 34 advanced
students at Stanford filled out the Vocational Interest Blank 
once naively, under usual (minimum) instructions, and once 
with advice to attempt to qualify as well as possible as en- 
gineers; this called for 420 judgments apiece instead of 420 
choices. Results are given in table 1. 

TABLE 1

Positions under different instructions

NUMBER    FIRST RATING    SECOND RATING
   8           A               A
   1           A          (illegible)
  13           B               A
   4           B               B
   1           C               B
   7           C               A

Twelve (35.3%) of the ratings remained the same; 1 (2.9%)
was lowered; 21 (61.8%) were raised. Of most significance,
perhaps, is the fact that although 8 (23.5%) of these students
did not qualify by this rating for congeniality, they were able
to simulate, almost ideally, actual engineering interest.
Furthermore, it was ascertained in this experiment that the
Certified Public Accountant score of 21 (61.8%) of these
men was raised by their effort to qualify as engineers (others
lowered), and the Ministry score of 26 (76.5%) was lowered
(others raised).

The subjects in the present experiment were volunteers
recruited at San Mateo Junior College (California) during
February 1930 through a bulletin board announcement to the
following effect:


NOTICE TO FRESHMEN AND SOPHOMORE MEN 


Do you know what occupation you should prepare for? Are 
you majoring in studies which lead to a vocation that suits 
your interests, abilities, tastes, and attitudes? 

If you don’t know, neither does any one else. An intima- 
tion, however, is given by careful study of responses on a 
questionnaire known as the Strong Vocational Interest Blank. 
It is probably the best guide yet devised for selection of a 
congenial occupation. (Reference was here made to a nominal 
charge kindly allowed by Dr. Strong for pursuit of this 
study.)

This nominal rate is open only to students who seriously 
desire assistance in selection of an occupation, and who are 
willing to appear twice, once during February and once dur-
ing March or April, for one hour each time. Those who fill 
out the blank during February must be sure to appear for the 
second time, when first results will be given them. (Following
provision for indicating interest.)

Forty-eight students were tested on February 25 and 26, 
46 of whom were retested during the first week in April, with 
results reported in table 2. When the students met the 
examiner for the second time, the following announcement 
was made: 

‘‘About a month ago you filled out the Strong Vocational
Interest Blank with the object of learning how congenial you
would be likely to find various occupations. Your papers have
been scored for 19 occupations, and I shall give you 19 letter
grades, with full explanations, in a few minutes.

‘‘First, however, I shall ask you to fulfill our original agreement
by cooperating with me again. I am going to give each
of you a copy of the same blank. This time I want you to 
show me how well you know the interests, attitudes, likes and 
dislikes of men school teacher-administrators. 

‘‘Please imagine that you are an applicant for the position
of principal of a small junior high school. There are several 
other applicants for this position, so that the board of educa- 
tion has decided to give all candidates the Strong Vocational 
Interest Blank. The board agrees that all of you have equal 
ability and that your training and experience are adequate, 
but they want a man who will find the work congenial and 
be permanent. 

‘‘Please fill out the blank this time with these assumptions
in mind. You want this position badly. The school is attrac- 
tive, and you will be able to arrange your own program of 
teaching and administrating. In this test you are competing 
with several more experienced men, but you know that every- 
thing depends upon ability to rate high in one thing: teacher- 
administrator. Your teacher-administrator score is the only 
one that will count. Go ahead.’’ 

Cooperation was excellent; comment during attempt ‘‘to
fudge’’ and afterwards indicated that each man desired to 
show knowledge of the typical man school-teacher-administra- 
tor. The readiness of the participants, as well as resulting 
scores, suggests a use of this procedure (perhaps more par- 
ticularly with personality tests) which may be found to re- 
veal important knowledge and judgment immeasurable by 
any other method. 

Results are shown in table 2. All scores used in this study 
were obtained from the Hollerith machine and were 12,600 
more than the scores secured by persons scoring by hand with 
the scoring scales provided by Strong. In order to obviate 
minus signs resulting from subtraction of 12,600, only 12,000 
was subtracted from Hollerith scores; resulting scores are in 
the first seven columns of table 2; in addition, two columns of 
means comparable to those usually employed are added. 
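
The relation between the three score scales described above can be sketched in a few lines. This is only an illustration of the stated arithmetic (machine score = hand score + 12,600; tabulated score = machine score - 12,000); the function names and the example machine score are hypothetical and are not taken from the study.

```python
# Constants stated in the text.
HOLLERITH_OFFSET = 12_600  # amount by which machine scores exceed hand scores
SUBTRACTED = 12_000        # constant actually subtracted to avoid minus signs


def reported_score(hollerith_score):
    """Score as tabulated in this study: machine score less 12,000."""
    return hollerith_score - SUBTRACTED


def hand_equivalent(reported):
    """Hand-scale score implied by a tabulated score (600 points lower)."""
    return reported - (HOLLERITH_OFFSET - SUBTRACTED)


# Hypothetical machine score, chosen only for illustration.
example_hollerith = 12_566
print(reported_score(example_hollerith))                   # 566
print(hand_equivalent(reported_score(example_hollerith)))  # -34
```

A tabulated score therefore stands 600 points above the corresponding hand score, which is why a score that would be negative when scored by hand appears as a positive number here.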

It may be seen that scores were affected favorably to quali- 
fication or congeniality in 9 occupations, unfavorably in 10; 
that 8 of the 19 occupations were affected with greater reliability
than one chance in a thousand would account for
(see diff. / S.D. diff. greater than 3); that the most favorably affected
scores were for teacher-administrator, certified public accountant,
personnel manager, Y. M. C. A. secretary, minister, and
lawyer, in this order of importance. These may be said to 
be associated in these students’ minds in this institution at 
this time. The most definitely non-associated occupations 
appear to be farmer and real estate salesman. These changes 
are despite the fact that Strong estimates the reliability of 
his blank to be at least .85. The difference herein indicated 
is not a discrepancy but must be charged to the subjective 
factor of purpose. 
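
The phrase ‘‘one chance in a thousand’’ for a ratio of difference to S.D. of difference greater than 3 can be checked against the normal curve. The sketch below assumes that the ratio is referred to a normal distribution; whether a one-tailed or a two-tailed probability was intended is not stated in the text, so both are shown. The function name is illustrative only.

```python
import math


def upper_tail_probability(critical_ratio):
    """Area of the normal curve lying above the given ratio (one tail)."""
    return 0.5 * math.erfc(critical_ratio / math.sqrt(2.0))


one_tailed = upper_tail_probability(3.0)
two_tailed = 2.0 * one_tailed

print(round(one_tailed, 5))  # 0.00135
print(round(two_tailed, 4))  # 0.0027
```

On the one-tailed reading the chance is about 1.35 in a thousand, which agrees roughly with the figure quoted.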

Following the experiment, participants were asked to give 
their rank choices of school teaching-administrating as an 
occupation in the range of 19 for which the blank was 
analyzed. Explanation of the changes in scores was sought 
by running correlations between: rank choice of our criterion
occupation (C), naive score (N), faked score (F), Thorndike 
intelligence test score (T), and gain in score from naive to 
faked (G). These are given with means and standard devia- 
tion of the distributions of each measure in table 3. 


TABLE 3

Means and intercorrelations between certain measures

                                   Correlation with
Measure    Mean  ±   S.D.        N       F       T       G
C           7.22 ±   3.58      -.21    -.13    -.07     .07
N         566.23 ± 117.77               .18     .28    -.63
F         813.00 ± 213.00                       .35     .82
T          71.78 ±  10.43                               .00
G         246.77 ±  52.45





Correlations are Pearson product-moment coefficients; none
below .27 have significance. It must be remembered that C 
(choices) scores are given as ranks from 1, first choice, to 19, 
last. 

Choice of teaching as an occupation apparently tends to 
run counter to congeniality or coincidence of interests, and 
to be unrelated to intelligence or ability to improve specific 
score for such congeniality. This might not be found in 
other occupational fields, and like the other relationships is 
probably affected by the specificity of faking. 

Correlations on the second and third data lines of table 3 
denote significant associations of teaching congeniality with 
intelligence, but of a greater dissociation between congeniality 
and ability to simulate a higher interest. This can hardly be 
attributed to limitation in the upper range for those who 
naively placed high, for only one faked above the midpoint of 
the range from average F score to the limit of the test. A 
possible explanation is that it is more difficult to fake interest 
in one’s true field than in another. 

There is no relationship between ability to change score 
and intelligence, but the latter is significantly related to 
knowledge or judgment of teaching interests (probably
through N scores). In other words, the social insight which
functioned in this experiment to yield high F scores is part
of general intelligence (and the relationship must be minimized
greatly by the NG correlation of -.63).

With a larger population and a simpler test, path coefficients 
of multiple correlation should probably be used to measure 
conjoint influences. What the ability is which functions on 
request to improve scores on personality and interest tests— 
more specific than intelligence—might be measured by this 
technique and called social insight. 


SUMMARY 


Retesting under conditions of changed instruction was 
undertaken with the Strong Vocational Interest Blank in an 
attempt to measure ability to make scores. Results seem to 
warrant the following conclusions: 

1. Students are able intentionally to distort their scores on 
an interest blank, and to succeed in qualifying well for an 
occupation chosen at random, so far as they are concerned, 
despite a low average initial predilection. 

2. Students are able to improve their scores markedly when 
they try, and this is inversely related to true score. 

3. The ‘‘typical’’ man school teacher-administrator is 
viewed by these students as a sort of cross between a religious 
social worker and an accountant; a more strict interpreta- 
tion might be that actually or in the minds of this group the 
occupations shifting together in the same direction have the 
content of this test in common. 

4. Faked marking affects not only the simulated occupa- 
tional interest of the students, but distorts seriously at least 
half of the other occupational indications. 

5. Actual rank-choice of occupation simulated bears a negative
relation to naive score and an uncertain relation to Thorndike
score and to faked score and to gain in score. Naive
score, however, may correlate with faked score and probably
does with Thorndike score and certainly negatively with gain
in score. Faked score correlates very significantly with gain 
in score but also with Thorndike score; whereas there is no 
apparent correlation between gain and Thorndike score, prob- 
ably because of the negative correlation mentioned above. 

6. Further experimentation of this sort with personality 
tests is suggested for measurement of social insight and the 
extent to which purpose may distort test results. 








ANALYSIS OF SOME FACTORS CONDITIONING 
LEARNING IN GENERAL PSYCHOLOGY* 


HOWARD P. LONGSTAFF 


University of Minnesota 


PART II 


RESULTS AND INTERPRETATION OF THE ANALYSIS BASED 
UPON CORRELATIONS 

Another approach to the question of the relative efficiency 
of the lecture-quiz (L.Q.) method and the straight lecture 
method is to correlate the achievement of each group with 
college ability test scores. If the L.Q. method is a better 
method of teaching it should motivate the students to work 
more nearly up to their ability; consequently, the correlation
between achievement and college ability test should be higher 
for the L.Q. Group than for the L. Group. To test this 
hypothesis, the following correlations were computed between
grades earned for each quarter’s work and C.A.T. scores.
It is apparent from Table 7 that the differences in the size 


TABLE 7

This table contains coefficients of correlation between final grade at the
end of each quarter for both L.Q. and L. Groups
with College Ability Test

                                                     r       P.E.
L.Q. Group
  Final Score End of 1st quarter vs. C.A.T.         .43    ± .038
  Final Score End of 2nd quarter vs. C.A.T.         .41    ± .040
L. Group
  Final Score End of 1st quarter vs. C.A.T.         .36    ± .041
  Final Score End of 2nd quarter vs. C.A.T.         .49    ± .036




of the correlations are not sufficiently large to warrant conclusions
of superiority for either method. Neither are the
coefficients consistent, the L.Q. Group correlating slightly
higher the first quarter and the L. Group correlating higher
the second quarter. Furthermore, analysis of the scatter plots
showed no appreciable differences in the distribution of the
scores around the line of regression.

* Part I appeared in The Journal of Applied Psychology, Vol. XVI,
No. 1, February, 1932.

Another approach to our problem is to compare the rela- 
tionship between achievement in psychology, as measured by 
grades in the course, and expected achievement, as measured 
by the ‘‘equation score.’’ The ‘‘equation score’’ is the average
of C.A.T. and H.P.R. It is obvious from Table 8 that

TABLE 8

This table contains the correlations expressing the relationship between
achievement and ‘‘equation score’’

                                                          r       P.E.
L.Q. Group
  Final Grade for 1st Quarter vs. Equation Score         .63    ± .028
  Final Grade for 2nd Quarter vs. Equation Score         .60    ± .031
L. Group
  Final Grade for 1st Quarter vs. Equation Score         .60    ± .031
  Final Grade for 2nd Quarter vs. Equation Score         .67    ± .027

there is very little difference in the efficiency of the two 
methods when this criterion is used. An analysis of the 
scatter plots showed that the two divisions functioned in 
much the same manner, each having about the same number 
of individuals doing better or worse than expected. 

We may conclude then that the results obtained by the 
correlation method of analysis substantiate the findings se- 
cured by measures of achievement in that no significant differ- 
ences are revealed to distinguish the two methods of teaching. 
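
The judgment that the coefficients of Tables 7 and 8 do not differ significantly may be illustrated by dividing each difference between the two groups by the probable error of that difference. The sketch below assumes that the two groups are independent, so that the probable error of a difference is the square root of the sum of the squared probable errors, and it assumes the conventional criterion of four times the probable error; neither the formula nor the criterion is stated in the text, and the names used are illustrative only.

```python
import math


def pe_of_difference(pe_a, pe_b):
    """Probable error of the difference between two independent coefficients."""
    return math.sqrt(pe_a ** 2 + pe_b ** 2)


def critical_ratio(r_a, pe_a, r_b, pe_b):
    """Absolute difference between coefficients divided by its probable error."""
    return abs(r_a - r_b) / pe_of_difference(pe_a, pe_b)


# (label, r for L.Q. Group, its P.E., r for L. Group, its P.E.)
COMPARISONS = [
    ("Table 7, 1st quarter", 0.43, 0.038, 0.36, 0.041),
    ("Table 7, 2nd quarter", 0.41, 0.040, 0.49, 0.036),
    ("Table 8, 1st quarter", 0.63, 0.028, 0.60, 0.031),
    ("Table 8, 2nd quarter", 0.60, 0.031, 0.67, 0.027),
]

THRESHOLD = 4.0  # assumed conventional criterion, not taken from the text

ratios = {
    label: critical_ratio(r_lq, pe_lq, r_l, pe_l)
    for label, r_lq, pe_lq, r_l, pe_l in COMPARISONS
}

for label, ratio in ratios.items():
    verdict = "significant" if ratio >= THRESHOLD else "not significant"
    print(f"{label}: difference / P.E. = {ratio:.2f} ({verdict})")
```

Under these assumptions every ratio falls below 2, well short of the assumed criterion, which is in keeping with the conclusion drawn above.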


RESULTS AND INTERPRETATION BASED UPON AN ANALYSIS
USING THE QUESTIONNAIRE METHOD

The measures of achievement and relationship discussed 
above yield no significant differences in favor of the lecture- 
quiz method of teaching elementary psychology. But the 
question remains whether any other available method of 
evaluation may reveal a relative superiority. It is possible 
that the measures so far employed have left out of account
some advantages inherent in one or the other method. These
qualities, if they exist, may appear in the subjective impressions
held by students. To the end of discovering at least
some trace of superiority in one or the other method, an 
analysis of student opinion in the L. and L.Q. Groups was 
undertaken. If one method of instruction is superior to the 
other it would seem that students taught by that method 
would be more favorably impressed with the course than 
those taught by the inferior method. The questionnaire re- 
sults reported in Section II were made up in part of opinions 
secured from the students in the L. and L.Q. Groups. The 
results for these students have been analyzed separately and 
will furnish the basis for our discussion of subjective opinion 
as a measure of the relative efficiency of the two methods of 
teaching under consideration. A condensed statement of the
findings is reproduced in Table 9.

The following questions deal with the quality of the course, 
its level of difficulty and subject-matter. Question 1 has to 
do with the question of whether the students would insist 
upon, recommend, say nothing about, discourage, or strongly 
discourage a brother or sister in next year’s sophomore class 
taking psychology. 

Question 2 asks the student to compare psychology with 
other college courses he has had from the standpoint of 
quality. He could rate the course as follows: first, in the 
highest one-fourth, above average, below average, in the lowest 
one-fourth, last. These ratings were numbered from 1, most 
favorable, to 6, least favorable. Question 3 dealt with the 
relative difficulty of psychology and other courses the students 
had taken. It was arranged the same as question 2. In ques- 
tion 4 the student was requested to compare psychology with 
the average college course which he had taken from the stand- 
point of its being more, equally, or less difficult, provocative 
of thought, interesting, valuable in connection with other college
work, and applicable to every-day life. The evaluations
of ‘‘more’’ were assigned the value of 1; ‘‘equally,’’ 2; and
‘‘less,’’ 3. In answer to question 7, the student indicated
whether he did or did not intend to take more psychology.

TABLE 9

Condensed results on questions 1, 2, 3, 4 and 7 of the questionnaire.
The results were condensed by averaging the per cent of students
who answered question number 1 as 1 or 2; questions numbers 2
and 3 as 1, 2, or 3; and question 4 as 1 or 2. The table reads: 90%
of the students in the L.Q. Group and 91% of those in the L. Group
would insist or recommend that a relative should take the course
if they had the opportunity.

QUESTION   QUESTION & EVALUATION                          L.Q. GROUP   L. GROUP
NUMBER     USED HEREIN                                    N = 460      N = 335

1          If I had a brother or sister in next year’s
           sophomore class I would insist or recommend
           that he take psychology                        90%          91%
2          Comparing it with other courses, I would rank
           it first, in the upper one-fourth, above
           average in quality                             93%          97%
3          Comparing psychology with other courses, I
           would rank it first, in the upper one-fourth,
           above average in difficulty                    82%          84%
4          Comparing psychology with the average college
           course, I rate it more, equally, less:
             a  Difficult                                 88%          87%
             b  Provocative of thought                    94%          97%
             c  Interesting                               90%          96%
             d  Valuable for other college courses        90%          91%
             e  Applicable to everyday life               92%          96%
7          I do intend to take more psychology            58%          54%

It is apparent from a study of the condensed results in 
table 9 that both divisions are favorably impressed with the 
course. The students taught by the lecture-quiz method, 
however, are not more favorably disposed towards the course 
than those taught by the lecture method. In fact there is a 
slight tendency in the other direction. The differences are 
not of sufficient magnitude to be of practical significance and 
we may conclude that students’ opinions are extremely favor- 
able toward the course as taught either by the lecture or the 
lecture-quiz method. 

Two other questions included in the questionnaire furnish 
additional evidence of student opinion towards methods of 
presenting the course. These are questions 8 and 9 which 
called for a ranking of three different methods of teaching the 
course, from 1, highest, to 3, lowest. The three methods 
evaluated were: three large lectures (500 students) per week;
two large lectures and one quiz (50 students) per week; or 
three independent section meetings per week (lecture and dis- 
cussion) (50 students). Both questions were the same except 
that question 8 was introduced with the statement, ‘‘The fol- 
lowing methods of organizing the course are practical’’; while 
question 9 was introduced by the statement, ‘‘Assuming all
psychology 1-2 teachers had equal and very high ability, I 
would prefer, (rank as above).’’ 

The results secured from these questions are not as compar- 
able between the two divisions as they would have been if the 
methods of teaching had been reversed the second quarter by 
the use of L-L.Q.—L.Q.-L. order for the two groups of stu- 
dents. This was not done because it was felt that the results 
obtained from our measures of achievement (which form the 
most objective part of the study) would furnish us with a 
more reliable answer to our problem if the same method of 
teaching was used for each group throughout the two quarters.
The opinions of the L.Q. Group should probably receive
the more serious consideration as the students in that division 
had actual experience with both lecture and quiz sections, 
while the L. Group had to base their opinions either upon 
theoretical grounds or upon experiences with quiz sections in 
other courses. 

It was found that the L.Q. Group, basing their rating upon 
experience, gave two of the methods the same average rank, 
the lecture and the lecture-quiz methods. The method of small 
independent sections is ranked third by this group. When the 
methods are rated on the theoretical basis of all instructors 
having very high and equal ability, the L.Q. Group rates the 
lecture-quiz method first, the small independent sections, sec- 
ond, and the lecture method, third. The L. Group ranks the 
lecture method first on the basis of present organization and 
gives the methods of lecture-quiz and small independent sec- 
tions a lower rank. On the basis of the theoretical organiza- 
tion, this group still ranks the lecture method first, lecture-quiz 
method second and small sections third. It is to be noted then, 
that in both the L. and L.Q. Groups small sections taught by 
the same instructor throughout the quarter are ranked third, 
if these sections are to be taught after the manner character- 
istic of such courses in the University. On the theoretical 
basis of all teachers being of equal and high ability, the method 
of small sections is ranked third by the L. Group and second 
by the L.Q. Group. That both groups tend to rank the small 
sections lower than the lecture or the lecture-quiz sections is 
significant since most college work has been done in classes of 
small size and many have deplored the necessity of building 
up large lecture-quiz sections as an economical substitute. It 
is evident from these findings that both groups tend to rank 
highest the type of instruction they are receiving. We may 
conclude then that insofar as student opinion is concerned, 
large sections are considered equal or superior to small sec- 
tions. 










Additional evidence of a less objective nature has come to 
the writer’s attention. The following quotation from an edi- 
torial in the Minnesota Daily, the official student publication, 
reveals the attitude of at least one student towards the quiz 
section method: ‘‘The supplementation of lectures by recita- 
tion classes is of little, if any use. Of necessity the instruction 
obtained in these sections is, on the average, poor. The sec- 
tions are handled almost always by graduate students who 
have neither a secure grasp on the material they are teaching 
nor the ability to convey to the student such information as 
they may have. Rather than stimulating interest in the sub- 
ject, these sections are fatal to it. . . . The recitation sections 
are a failure not only from an educational standpoint, but also 
from a pedagogical standpoint. The students who need and 
might benefit from repetition use the class hour to catch up on 
back sleep, to read the College Newspaper, or to study for 
their next classes. Apparently the only ones who benefit in 
any way from the sections are the instructors who are paid 
for teaching them.’’ Another quotation from a student who 
is studying in Europe on a scholarship granted by the Insti- 
tute of International Education (20), is also interesting in 
this connection. ‘‘I think that foreign educational methods 
are quite superior to ours; lecture courses keep the student’s 
interest much better, and one learns more through them than 
through the system of recitation hours which we have in 
America, and which now seems to be somewhat juvenile.’’ It 
is probable that similar student opinion could be assembled 
in abundance to demonstrate that opinion does not necessarily 
favor the traditional small discussion type of class procedure. 

Katz and Allport (21, P.P. 79-80), investigating student 
attitude towards the lecture and the discussion methods of 
teaching found the following: ‘‘Item 10 of the Syracuse Reac- 
tion Study comprised a five step scale, one extreme of which 
read: ‘I get the greatest value from those courses which are 
conducted entirely by the lecture method.’ The opposite 
extreme placed the greatest value upon courses conducted 
entirely by the method of ‘discussion.’ The middle or third
step placed the greatest value upon ‘courses in which lectures 
and discussions are equally stressed.’ Apparently the stu- 
dents recognize value in each method; for the modal position 
checked in each college was the middle step.’’ Perhaps this 
reflects the fact that these students merely approve that which 
they have experienced most frequently. In college few courses 
are given by lectures alone or by discussion alone and hence 
a combination of lecture-discussion becomes the basis for a 
‘‘middle of the road’’ attitude. 

Hudelson (19, P. 112) gives information which may throw 
some light upon the usual expressed preference for small 
classes. He found that it was the opinion of both students 
and faculty that the large lecture classes were better for the 
brighter students and the small sections were better for the 
duller ones. The reason suggested by the better students for 
this situation was that in the small sections the dull student 
may go to see his instructor, put up a bold front and assume 
an attitude of great industry which may influence the instruc- 
tor in his favor. If this be true, it means that the small sec- 
tion actually does not benefit the student from the standpoint 
of actual learning but militates against the accomplishment of 
those students best able to profit from instruction. 


RESULTS AND INTERPRETATIONS BASED UPON AN 
ANALYSIS OF RETENTION 


A fourth approach to our problem is to measure achieve- 
ment and retention by means of an examination given the first 
day of the course, covering materials of the first quarter’s 
work, and repeating it at the end of both quarters of the 
course. The pre-course examination, the nature and method
of administration of which have been described previously,
functioned in this capacity of measuring initial status, achieve-
ment, and retention. 

The results obtained from these examinations were analyzed 
as follows. Each student’s score on the pre-course test was 
divided by two, in order to make it comparable to that obtained
upon the pre-course test repeated at the end of each quarter. 
(For it is to be remembered there were two separate sets of 
questions used in the final examinations and each set contained 
fifty of the original one hundred questions of the pre-course 
examination.) This value was subtracted from 50 and the 
difference so obtained was considered as 100 per cent possible 
improvement. The difference between the pre-course score 
and the score obtained on repetition of the test at the end of 
the first quarter was considered as the amount of improvement. 
The amount of improvement divided by the possible improve- 
ment was considered as the per cent of improvement or gain. 
Retention was measured by finding the difference between the 
score obtained on the pre-course test when given at the end of 
the first quarter and the score made on this examination when 
given at the end of the second quarter. It must be remem- 
bered that the pre-course covered only the work of the first 
quarter. This difference was divided by the amount of im- 
provement made the first quarter and this quotient was sub- 
tracted from 100; the result is the per cent of retention. Table 
10 contains a detailed analysis of the way this procedure works 
out when applied to the test papers of a given student selected 
for illustration. 
TABLE 10

STEP   DESCRIPTION OF PROCESS                                    RESULT
1      Score on pre-course examination of 100 items              20
2      Score in Step 1 divided by two                            10
3      Score on pre-course repeated at end of first quarter      35
4      Amount of improvement: obtained by subtracting
       result in Step 2 from score in Step 3                     25
5      Possible improvement: obtained by subtracting result
       in Step 2 from 50 (maximum possible score)                40
6      Percent of gain: obtained by dividing result in Step
       4 by result in Step 5                                     65
7      Score on pre-course repeated at end of second quarter     32
8      Amount of forgetting: obtained by subtracting
       figure in Step 7 from that in Step 3                      3
9      Percent of Retention: obtained by dividing value in
       Step 8 by that in Step 4 and subtracting from 100         88%
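
The steps of Table 10 may be written out as a short computation. The sketch below follows the procedure as described in the text; the function and argument names are illustrative only and are not the author's. Applied to the scores of the student in Table 10, the arithmetic of Step 6 gives 25/40, or 62.5 per cent, whereas the table prints 65; the printed figure may be a misprint or a rounding, and it is left as printed in the table.

```python
def percent_gain(pre_score, end_q1_score, max_score=50):
    """Per cent of possible improvement achieved by the end of the first quarter.

    pre_score is on the 100-item pre-course test and is halved to match the
    50-item form repeated later. Undefined if the halved score equals max_score.
    """
    baseline = pre_score / 2                # Step 2
    improvement = end_q1_score - baseline   # Step 4
    possible = max_score - baseline         # Step 5
    return 100.0 * improvement / possible   # Step 6


def percent_retention(pre_score, end_q1_score, end_q2_score):
    """Per cent of the first-quarter improvement still held after the second quarter.

    Undefined if no improvement was made during the first quarter.
    """
    baseline = pre_score / 2                        # Step 2
    improvement = end_q1_score - baseline           # Step 4
    forgetting = end_q1_score - end_q2_score        # Step 8
    return 100.0 - 100.0 * forgetting / improvement  # Step 9


# Scores of the illustrative student in Table 10.
print(percent_gain(20, 35))           # 62.5
print(percent_retention(20, 35, 32))  # 88.0
```

Both measures are ratios, so a student who scores at the ceiling on the pre-course test, or who makes no improvement during the first quarter, cannot be given a value by this procedure.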

There are objections to these methods of measuring learning
and retention. We have no assurance that an individual who 
has a low pre-course score, consequently a large possible im- 
provement and who actually shows a gain of 50 per cent has 
achieved as much as an individual who has only a small pos- 
sible improvement and who shows 50 per cent of gain. The 
questions used in the pre-course had been studied by the 
‘‘trade test’’ technique as described by Chapman (6) and
were all known to have fairly high diagnostic value, yet to be 
sure as to what a gain of any given amount means it would 
be necessary to have all questions in the examination of known 
difficulty, a task which is nearly impossible to accomplish. 
Present methods of scaling test items are valuable in this respect
but far from perfect. Holzinger (18) in discussing
scaling makes the following statement: ‘‘scaled values are
often an unnecessary refinement in measuring large groups as
evidenced by the high correlation between scaled and unscaled
items. Professor Douglas, for example, found a correlation 
of about .98 between weighted and unweighted algebra scores, 
a result which is much higher than the reliability of the tests 
themselves. He concluded that the unscaled values give the 
relative standing of pupils with sufficient accuracy for ordi- 
nary testing uses.’’ In a footnote he adds, ‘‘Dr. Scates and 
the writer have also found correlations of .994, .995, .997, and 
.998 between weighted and unweighted scores, the number of 
items weighted varying from six to ten, and the weights be- 
ing quite different.’’ Since our measures do show differential 
performance on the part of the students, and since the items 
have been roughly scaled by the trade-test technique, and in 
view of Holzinger’s statement of the high correlation between 
scaled and unscaled measures, the examinations used herein
should be at least fairly adequate for our present purpose. 
Since whatever advantages or disadvantages inhere in our 
methods of measuring learning and retention apply equally 
to both groups of students under consideration, it follows that 
no great error will be encountered if they are employed to 
determine the relative effectiveness of the two methods of
teaching under discussion. 

The results obtained from the method of analysis discussed 
above are presented in table 11. It is apparent from these 
findings that the two methods of teaching employed in this 
experiment are of practically equal value when considered 
from the standpoint of achievement and retention. Although 
a slight superiority in achievement is demonstrated by the L.Q. 
Group and a slight superiority in retention is demonstrated in 
the L. Group, neither of these differences is of sufficient mag- 
nitude to be significant as they fall well within the limits of 
chance fluctuations. 

These findings are not in agreement with those cited by Bane 
(3) in a book published since the completion of this study, 
which summarizes the results of five studies conducted by him- 
self and his colleagues, Gotke, Kirby and Robbins. He also 
summarizes the work of Greene, Spence and Watson, Scheide- 
mann, and Hudelson as well as investigations conducted by 
Morris, Tuttle and Downing. 

The methods of teaching used in these studies are not di- 
rectly comparable to those used in this investigation, yet, they 
contain a sufficient number of common elements to warrant 
special mention. 

Bane concludes from these studies: ‘‘1. Experimentation 
affords no basis for the wholesale disproval of the lecture. 
2. The chief advantage of the lecture is that it is an eco- 
nomical way for imparting information for immediate use. 
3. The greatest defect of the lecture is that it does not make 
for retention of the material taught and needs to have safeguards
worked out for it in this respect.’’ The disagreement of our results
with this latter finding may be due either to a superior degree of
overlearning on the part of our subjects or the difference in lapse of time
between the completion of the work and the re-test. Our find- 
ings tend to support the first explanation. 

In addition then to showing very little difference in stu- 
dent achievement when taught by the two methods under 
discussion, the results on retention throw light upon the
efficiency of these methods as teaching devices. The methods
of teaching employed in institutions of higher learning
ods of teaching employed in institutions of higher learning 
are being criticized more and more each year. Formal 
courses and grades as sole criteria of success in these courses 
have been charged with leading students to lose sight of the 
task of securing an education and working only for marks, 
a situation which leads to cramming and results in rapid 
forgetting as soon as the course is over and the mark is ob- 
tained. The charge is also frequently made that our present 
system leads to pampering and spoon feeding the students 
and does not throw them upon their own resources enough. 
The system of small discussion classes might well do this, as 
was pointed out by the student whose opinion was cited 
above, concerning the relative merits of the European and 
American systems of education. The trend of student opin- 
ion cited by Hudelson also bears out this possibility. 

In contrast the system of large lectures actually puts the 
student more upon his own, and should increase his feeling 
of responsibility and dependence upon his own efforts and 
not those of his instructor. This should be especially true 
when the student knows that he is to be examined frequently. 
The method of weekly quizzes, mid-quarter and final exami- 
nations should act as a motivating device which would tend 
to result in more thorough learning on the part of the stu- 
dent. This system of examinations should lead the student 
to review each week’s work for the weekly quizzes, to review 
the work of half the quarter for the mid-quarter and to re- 
view the work of the whole quarter for the final examination. 
This process of frequent review and synthesis of the subject- 
matter studied is likely to result in overlearning and hence 
lead to good retention. But if the students feel that they 
have achieved their main purpose when the quarter’s marks 
are reported to the registrar, this system presumably would 
promote prompt and rapid forgetting as soon as the final
examinations for a given quarter are over.

It has been experimentally demonstrated that for barely
learned material, forgetting is very rapid, and after a period 
of a few days most of it has been forgotten. If the charge is 
true that our present system of examinations leads to cram- 
ming to acquire a multitude of details for a specific examina- 
tion, we should expect rapid forgetting, because such a cram- 
ming process would probably result in barely learned 
materials. If the critics of our present system of teaching 
and examining are correct, then attempts to measure reten- 
tion at some future time after the books on a given quarter’s 
work have been closed should reveal a distressing amount of 
forgetting to have taken place. Our results throw light upon 
this question. The findings obtained from the pre-course ex- 
amination given at the end of the second quarter measured 
retention extending over a period of three months, with only 
the most incidental type of review possible, because the sec- 
ond quarter’s work was quite different from that of the first. 
It is evident from table 11 that the charges made against ex- 
aminations do not hold for introductory psychology, as 
taught at this institution. Our results show an average re- 
tention of 79.5 per cent. This need not mean a decrement of 
20.5 per cent for the three-month period. In all probability, 
it is much less than 20.5 per cent when we consider the prob- 
able location of the zero point. ‘‘True zero knowledge of 
psychology’’ is undoubtedly much below that represented by 
a score of zero on the pre-course examination. This consid- 
eration emphasizes the fact that these students in psychology 
have retained the materials they learned to a surprising de- 
gree. We may infer from this that the system of examina- 
tions in use probably produced a desirable degree of over- 
learning originally and thus insured adequate retention. At 
least these facts prove that examinations did not cause very 
much forgetting. Cederstrom (5) working with much the 
same problem in biology finds that ‘‘even after a lapse of a 
full year, without any opportunity for review, the students 
retain from six-tenths to eight-tenths as much as they gained
during the work of the course.’’ In view of these
experimental results showing that students do not promptly forget
what they learn as soon as a final examination is over, it 
would seem that the charges levied against our present sys- 
tem of examinations are easier to make than to substantiate. 
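
The zero-point argument made above for the 79.5 per cent figure can be put as a short inequality. This is only a sketch under two assumptions not stated in the text: that retention is the ratio of the retest score R to the end-of-course score E, and that a score of zero on the examination already stands c units above ‘‘true zero knowledge of psychology,’’ with c > 0.

\[
\frac{R}{E} = 0.795,
\qquad
\text{true decrement} = \frac{(E + c) - (R + c)}{E + c} = \frac{E - R}{E + c} < \frac{E - R}{E} = 0.205 .
\]

The larger the assumed offset c, the smaller the true decrement, which is why 20.5 per cent is best read as an upper bound on the amount forgotten.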

A further criticism has frequently been raised, i.e., that results
obtained by the use of objective tests measure only factual
materials and show nothing of the students’ ability to see
relationships, to synthesize, and to think in terms of the
subject-matter of college courses. Such criticism is most often
presented by those who do not realize the intimate
interrelation between mere ‘‘breadth of knowledge’’ and ability to
see relationships. Paterson (35) has pointed out that there
is good statistical evidence that the old and new types of
examinations actually measure the same functions. The
results obtained in this study agree with his statistical
criterion. That is, subjective examinations correlate as closely
with objective examinations as subjective ones correlate with 
each other. We have also pointed out the fact that the ob- 
jective examinations used in this study are more valid in 
every case (four criteria being utilized) than the subjective 
ones. Furthermore, the instructors in psychology have en- 
deavored to minimize mere information in constructing ob- 
jective examination questions and to emphasize types of 
questions which would presumably measure relational think- 
ing ability. So far as intellectual grasp and subject-matter 
assimilation and achievement are concerned there is every 
reason to believe that our experiment has not been vitiated 
through any inherent weaknesses in the examining system 
now in use. 

Ignoring the evidence cited above and granting the critic’s
point that objective examinations measure only factual
materials, there is still a condition existing which cannot be
thrust lightly aside. All valid reasoning must, perforce,
rest upon a basis of factual material. One cannot reason
unless one has something to reason about. If facts and
content of a course are of no value, why teach them? Again we
have experimental evidence to substantiate our point. The 
content of the general psychology course has been shown to 
be considered valuable by the students who have taken the 
course. The array of facts presented upon this point in sec- 
tion II is convincing enough even for the most skeptical. 
Furthermore, the course as taught at Minnesota is the result 
of years of work by psychologists of recognized ability. They 
have included in the content of the course what in their 
judgment are the most valuable and important principles of 
psychology from the point of view of the general run of 
sophomore students. Thus a content which has ample proof 
of its value, presented by a method experimentally demon- 
strated to result in satisfactory achievement cannot be thrust 
aside by mere unsupported statements of opinion. 

In view of the facts brought out above, it would seem log- 
ical not to discard present examining methods when we have 
good evidence that they are producing efficient learning but 
rather to supplement them with even more examinations. 
Just as the cure for the evils of democracy may be more de- 
mocracy so the cure in the case of asserted evils of examina- 
tions may be more examinations. It has just been shown that 
our present system leads to efficient accomplishment in the 
individual course under discussion. If it is wished to teach 
students to combine unitary groups of accomplishment into 
larger units, let us install a system of comprehensive exami- 
nations which should cause the same type of constant review 
and coordination of learning that has been obtained in psy- 
chology, except that the units coordinated will be whole 
courses rather than topics of a particular course. Thus we 
would advocate that comprehensive examinations be given 
at critical stages in a student’s progress and treated as addi- 
tions to, rather than substitutes for, our present examining 
techniques.

IV. THE RELATIVE ACHIEVEMENT OF SUPERIOR STUDENTS WHEN
TAUGHT BY THE STRAIGHT LECTURE METHOD IN LARGE 
GROUPS AND BY THE LECTURE DISCUSSION METHOD 
IN SMALL GROUPS 

The results reported in the preceding section demonstrated 
that there was apparently little, if anything, to be gained by 
the use of the lecture-quiz method in the introductory course 
in psychology. But because there was the feeling that the
lecture method might not be equally efficient for students 
of all levels of ability, a decision was made to experiment with 
superior students. The purpose of the investigation was to 
find the differences in the relative achievement of superior stu- 
dents taught by the straight lecture method in a large group, 
and other superior students taught by the lecture-discussion 
method in small groups. The large lecture group was made 
up of students of varying levels of ability; the small lecture 
discussion sections were composed of superior students of 
homogeneous ability. 

There is surprisingly little literature on experimental ap- 
proaches to the question of the effects of sectioning college 
students on the basis of ability. Most of it deals with the 
secondary or elementary levels of instruction.11 From a study
of the somewhat meager present-day literature on segregation 
of students into homogeneous groups for purposes of
instruction one generalization stands out above all others: superior
students supposedly do better work than poor students and 
accomplish more in a shorter period of time. Consequently, 
these students should be segregated and taught by a method 
which permits them to progress as rapidly as they wish and 
cover an amount of ground commensurate with their ability. 
Because a course for a heterogeneous group of students must 
be designed to meet the needs of the largest number who are 
only average in ability, educators reason that segregation is 
the only way to meet the needs of superior students. Opinion 

11 Worlton (49), Lincoln and Wodleigh (25), Miller (30), Miller (29), 
McCall (18).

holds that the average course which is supposedly too difficult
for the dull members of the class and too easy for the brighter 
members results in baffled dull students and bored superior 
students. The extremes of the student body are, then, poorly 
instructed. The logic of this generalization, which seems to
be irrefutable, has become the accepted belief of many edu- 
cators. 

If the arguments set forth by the advocates of sectioning on 
the basis of ability are sound and hold for the teaching of gen- 
eral psychology the following experimental set-up should fur- 
nish data to prove or disprove them: 

1. A group of superior students should be divided into two 
sections; one section should be taught as a small homogeneous 
group, the other should be taught in a large group composed 
of students of mixed ability. 

2. Both sections should be measured by the same examina- 
tions. 

3. All measures should be reduced to a common statistical 
basis. 

4. Attempts to measure the opinions of the two sections 
concerning the value, difficulty, and nature of the course in 
introductory psychology should be made. 

5. Achievement of the two groups should be correlated 
against measures of ability and past performance in college 
to ascertain if possible which group was the more highly 
motivated. 

If the assumptions underlying homogeneous grouping are 
valid, the following findings should be secured from the above 
experiment: 


1. The segregated section should achieve more. 

2. Opinions of the segregated group should be more favor- 
able towards the course as a whole. They should consider it 
more valuable, interesting, provocative of thought, etc. They
should also consider it more difficult than their average college
course, as they are supposed to be bored with the simplicity
and elementariness of their average course, because it is geared
down to the student of average ability. 

3. Correlations with ability should be higher for the segre- 
gated groups since segregation supposedly motivates them to 
work more nearly up to their capacity. 

4. Correlations with other courses (previous achievement)
should be lower, since in the segregated groups they are sup- 
posed to work harder than they have before, and consequently, 
secure better marks than when they were in heterogeneous 
groups. There should, therefore, be a wider discrepancy be- 
tween their achievement when segregated than when not segre- 
gated. This situation would lead to a reduction of the coeffi- 
cient of correlation. 


Such an experiment as that outlined above was conducted by the
Department of Psychology at the University of Minnesota during the fall
and winter quarters of 1929-1930. On the basis of the results reported 
in Section III the introductory course in psychology was put on the 
straight lecture basis, beginning with the fall quarter 1929. There were 
two large lecture divisions of the course taught during the fall and 
winter quarters of 1929-1930. These two divisions, which we shall 
hereafter call I Hour Division and III Hour Division, met at the first 
and third class periods respectively on Monday, Wednesday, and Friday. 
They were conducted and taught in the same manner as the lecture group 
(L. Group) reported in section III. At the first meeting of the course 
a mimeographed sheet asking for the total number of credit hours taken 
since entering the university and total number of honor points earned, 
was given to each student in the class. He filled in the information 
requested and returned the blank at the second meeting of the course. 
On the basis of the honor point ratio secured from these sheets, and the 
College Ability Test score, secured from the University files, the highest 
120 students in each division were selected to make up the subjects for 
the experiment on sectioning. As was pointed out in section III there 
is a small percent of error entering into the Honor Point Ratio when 
secured by this method but in order to get the students sectioned by the 
beginning of the second week of school, there was little choice left as it 
was practically impossible to compute honor point ratios based on records 
in the Registrar’s Office for a thousand students in such a short period 
of time. The exact procedure of selecting the superior students was as 
follows. The honor point ratio was computed and reduced to centile 
ranks. The college ability test scores were secured for all the members 
of the group and restandardized upon the basis of all the students
registered in the course; that is, the centile ranks were computed on the basis
of the scores of these thousand-odd psychology students. The centile
ranks for the H. P. R. and C. A. T. were added together. The sum of 
these two measures was arranged into a descending scale and the 120 
highest students in each of the two groups mentioned above were selected 
as subjects for the experiment. Sixty of each of these groups of 120
were placed in sections meeting alone; these were the experimental
sections. The remaining sixty were left in the large lecture groups and
were the control sections. The members of the experimental groups were 
matched with those in the control groups on the basis of the sum of the 
centile ranks of H. P. R. and C. A. T. The segregated sections met at 
the same hours as did the unsegregated, I hour or III hour on Monday, 
Wednesday and Friday. The I hour segregated group will be designated
as homogeneous Section W. The III hour segregated group will be 
called homogeneous Section X. The I hour unsegregated group will be 
designated as heterogeneous section Y, and the III hour unsegregated 
group will be called heterogeneous section Z. Homogeneous section W 
was taught by instructor A, described in section III and homogeneous 
section X was taught by instructor B, who was also mentioned in the 
third chapter. The two homogeneous sections W and X were taught by 
a combination method of lecture and discussion, with lectures the more 
important part of the course, but the students were encouraged to ask 
questions and initiate discussion whenever they wished. The control 
sections (heterogeneous sections Y and Z) were taught by the straight
formal lecture method, described in section III. The content of the 
course for all sections was the same as that outlined in the first section 
of this study. To be doubly sure that the basic content would be the 
same for all four sections the instructors of the homogeneous sections 
attended the lectures given in the heterogeneous sections. Since the 
same examinations were to be given both groups this precaution was 
taken to be sure that basic principles presented to the control sections 
would also be covered in the experimental ones. 
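
The selection procedure described above can be sketched in a few lines of Python. This is an illustration only, not the authors' computation: the function names are hypothetical, ties are given the midpoint centile, and the pairing rule (adjacent students in the ranking form a matched pair, one to each section) is an assumption, since the text says only that the groups were matched on the summed centile ranks.

```python
from bisect import bisect_left, bisect_right


def centile_ranks(scores):
    """Return each score's centile rank (0-100) within its own group.

    Ties receive the midpoint of the positions they occupy (an assumption).
    """
    ordered = sorted(scores)
    n = len(ordered)
    ranks = []
    for s in scores:
        below = bisect_left(ordered, s)
        ties = bisect_right(ordered, s) - below
        ranks.append(100.0 * (below + 0.5 * ties) / n)
    return ranks


def select_and_split(names, hpr, cat, top_n):
    """Keep the top_n students by summed H.P.R. and C.A.T. centiles,
    then split them into matched experimental and control halves."""
    total = [h + c for h, c in zip(centile_ranks(hpr), centile_ranks(cat))]
    order = sorted(range(len(names)), key=lambda i: total[i], reverse=True)
    order = order[:top_n]
    experimental, control = [], []
    # Assumed matching rule: neighbours in the ranking form a pair, and the
    # higher member alternates between sections so neither is favoured.
    # An unpaired student left over from an odd top_n is dropped.
    for k in range(0, len(order) - 1, 2):
        first, second = order[k], order[k + 1]
        if (k // 2) % 2 == 0:
            experimental.append(names[first])
            control.append(names[second])
        else:
            experimental.append(names[second])
            control.append(names[first])
    return experimental, control
```

With `top_n=120` per division this would yield the sixty-and-sixty split reported in the text, although the actual assignment within pairs may have been made differently.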

Approximately the same examining system was employed as that 
described in section III. The weekly quizzes were made out in confer- 
ence by all the instructors teaching the course. The objective mid- 
quarter and final examinations were also made out in joint conference 
by the instructors teaching the various sections. The mid-quarter exami- 
nation for the first quarter contained eighty questions. There were two 
separate sets of questions used in this examination which were arranged 
into four forms. Each form contained 30 analogy, 20 completion and 
30 multiple choice questions. All four forms were given both to the 
homogeneous and to the heterogeneous sections. Since the two sets of 
questions proved to be of unequal difficulty they were equated as
described in section III. The final examination for the first quarter
contained only one set of questions; these were arranged in two forms. This
examination was made up of 77 multiple choice, 44 analogy, 20 matching 
and 34 wrong word questions. This latter type of question was devised 
by Dr. E. F. Heidbreder; a sample is reproduced in figure IV. Holmes 
(17) has analyzed this type and her results show that it is very satis- 
factory. It has a reliability of about .80 (odd-even, Spearman-Brown 
correction used). When analyzed by the Trade Test technique they were 
found to differentiate students of varying levels of ability in a satisfac- 
tory manner. The final examination was administered to all students at 
the same two-hour examination period. The mid-quarter examination 
for the second quarter contained two different sets of questions, each 
set was made up of 32 multiple choice, 27 analogy and 21 completion 
questions. These two sets of questions were arranged into four forms
and were found to be practically equal in difficulty. The final examina- 
tion for the second quarter was made up of 48 multiple choice, 38 
analogy, 45 wrong word and 30 completion questions. Only one set of
questions was used, arranged into two forms. The administering of the
examination was identical with that for the first quarter. 
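
The odd-even reliability with the Spearman-Brown correction, cited above for the wrong word questions, can be illustrated as follows. The sketch assumes each item is marked 0 or 1; the function names and data layout are hypothetical and are not taken from Holmes (17).

```python
def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """item_scores holds one list of 0/1 item marks per student."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))
```

By this formula a correlation of about .67 between the odd and even halves steps up to the reliability of about .80 reported in the text.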

The various measures used were reduced to a sigma score by the use 
of the following formulae: 


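\[ \sigma\ \text{score, weekly quizzes} = \frac{X - M_x}{\sigma_x} \times 30 + 150 \]

\[ \sigma\ \text{score, mid-quarter} = \frac{X - M_x}{\sigma_x} \times 10 + 50 \]

\[ \sigma\ \text{score, final examination} = \frac{X - M_x}{\sigma_x} \times 40 + 200 \]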


Figure IV 
Directions: In each of the following sentences, pick out the one word 
that makes the sentence wrong and write it in the space on the left. 
The wrong word is never one of the first three words in the sentence. 


unconnected A complex is a system of unconnected ideas, having a 
strong emotional tone, and tending to facilitate action in 
line with it. 


The final grade for each quarter’s work was the result of adding these 
three measures, and plotting the same into a histogram. The histogram 
was divided into five step intervals to which were assigned the letter 
grades of A, B, C, D, F. The percentage of each of these grades as- 
signed the first quarter was: A, 10.3%; B, 17.0%; C, 44.6%; D, 17.3% 
and F, 10.8%. The second quarter there were A, 10.2%; B, 17.7%; 
C, 48.4%; D, 15.8% and F, 7.9%. This method of grading weighted 
the weekly quizzes three-eighths, the mid-quarter one-eighth and the final 
examination four-eighths. 
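
The weighting follows from the scaling: each measure is given a standard deviation of 30, 10, or 40 points, so the three contribute 30/80, 10/80, and 40/80 of the spread of the sum. A minimal sketch is given below. The function names are hypothetical, and the use of the class-wide population standard deviation is an assumption, as the text does not say which form was used.

```python
from statistics import mean, pstdev


def sigma_scores(raw, scale, center):
    """Convert raw scores to (X - M) / sigma * scale + center.

    Fails with a division by zero if every raw score is identical.
    """
    m = mean(raw)
    s = pstdev(raw)  # assumption: population standard deviation
    return [(x - m) / s * scale + center for x in raw]


def composite(quizzes, midquarter, final):
    """Sum the three sigma scores that make up the quarter's final grade."""
    q = sigma_scores(quizzes, 30, 150)
    m = sigma_scores(midquarter, 10, 50)
    f = sigma_scores(final, 40, 200)
    return [a + b + c for a, b, c in zip(q, m, f)]
```

A student exactly at the mean on all three measures receives 150 + 50 + 200 = 400 points. The cutting points of the five grade intervals on the histogram are not given in the text and are therefore not modelled here.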

This completes the discussion of the measures of achievement used in 
this study. In order to have as many approaches to the problem as
possible the questionnaire discussed in section II was also given to the 
students in this study. The questions asked and the nature of its
administration were the same as previously described, with the
exception that the students did not sign their names.


RESULTS AND INTERPRETATIONS 


Results which throw light upon the question of the relative 
achievement of superior students taught in small homogeneous 
sections as compared with those taught in large heterogeneous 
sections are contained in table 12. This table contains the 
results obtained by the use of the measures of achievement 
described above. These results were secured for 48 students 
in the I Hour Division and 52 in the III Hour Division.
Withdrawals, transfers, and similar changes made in the students’
schedules caused this reduction from 60 to 48 and 52 subjects
respectively. 

It is apparent from the table that, considering final grades 
in the course as a criterion of the relative value of the two 
methods of instruction, students taught as members of large 
heterogeneous sections achieve more than do students taught in 
small homogeneous sections. The differences are not statis- 
tically significant, however, except in the case of the work of 
the second quarter in the I Hour Division, heterogeneous sec- 
tion Y. Although the differences are not statistically signifi- 
cant the trend is so consistently in favor of the heterogeneous 
sections that it can be safely stated there is little chance for 
the occurrence of a reversal were the experiment to be re- 
peated under similar conditions. 

Of the sections in the two divisions, homogeneous section X 
is more nearly equal in achievement to heterogeneous section Z, 
than is homogeneous section W to heterogeneous section Y. 
In analyzing the ability of the four sections we see that homo- 
geneous sections W and X are slightly superior in ability as 
measured by C.A.T., but from the standpoint of previous 
achievement as measured by H.P.R. the heterogeneous sections 
Y and Z are slightly superior. In neither case are the dif- 
ferences large and it seems safe to conclude that insofar as 
these two measures are adequate criteria for equating the sec- 
tions they were of practically equal ability at the beginning 
of the experiment. 

It is obvious that in this study segregation of students of 
superior ability into small homogeneous sections did not 
lead to increased achievement but rather to slightly lowered 
achievement when compared with their peers in the large 
heterogeneous groups. 

In view of the claims made for the superiority of the method 
of segregating superior students into homogeneous classes by 
such men as Seashore (38) our results are at least surprising. 
This author, speaking of the advantages of sectioning on 
ability states, ‘‘A visitor to an ordinary college class will find
that one-fourth of the pupils are beyond the stage of
instruction and for them the class exercises serve as a deadening of
their best sensibilities and enthusiasms; one-fourth of the class 
are not capable of comprehending or performing the task in 
hand but sit listless and helpless and rightly regard themselves 
as unjustly abused; the members of the remaining half of the
class present a variety of conditions, but most of them are 
capable of profiting to some extent by the exercise. The elimi- 
nation of this waste of teaching what is known in the highest 
quarter and teaching what is beyond the grasp of students in 
the lowest quarter will result in great economy.’’ He states 
further, ‘‘It becomes possible to apply in teaching the
pedagogical maxim, which is the outcome of the discovery of the
individual; namely, ‘keep each student at his highest level of 
achievement in order that he may be successful, happy and 
good.’’’ In his closing paragraph he states, ‘‘The plea for 
this method of sectioning on the basis of ability is thus pre- 
sented by one who is confessedly an enthusiast for the method, 
having used it for several years with large sections in psy- 
chology. . . . It would, however, not be unreasonable to say 
that opposition to the plan can, after all, be most fairly pre- 
sented only by one who has actually put it to experiment and 
has arrived at an adverse conclusion. At the present time I 
know of no one who has qualified for that task. Let us, there- 
fore, apply a principle of science, and, before we render a 
verdict on the plan, ‘try it.’’’ Such decided statements bear 
out the contention made earlier in the section, that some edu- 
cators have become enthusiastic about the value of sectioning. 
Dr. Seashore advises that we apply a principle of science, ‘‘try 
it,’’ but he did not go on to add, under ‘‘controlled condi- 
tions.’’ The plan has been tried, in this experiment, under 
conditions which have been carefully controlled but we have 
failed to obtain the decidedly positive results claimed for it 
by its advocates. 

The criticism may be raised against this experiment that the 
additional achievement made by the segregated groups was not
measured by the examinations employed, that only the factual
materials of the course were measured. In answer to this ob- 
jection, we have pointed out that the basic content was the 
same for both segregated and non-segregated sections. We 
also pointed out in section III that before a student can form 
broad concepts and engage in relational thinking he must have 
fundamentals and facts from which to build concepts and to 
do relational thinking about. Consequently, if segregation is 
to lead to better learning and to the development of a broader 
type of thinking, it must follow that the segregated students 
will be taught in a better manner the basic materials upon 
which such thinking depends. Unless the system of segrega- 
tion does lead to superior learning of the basic materials of 
a course, it is difficult to see how it can function to advantage 
in other respects. It is evident from our findings that such 
superiority was not demonstrated for the method of sectioning. 

The charges levelled against the traditional method of het- 
erogeneous sectioning of students have stressed the point that 
such a system leads to boredom for the superior student. If 
this assumption is true, the homogeneous sections would pro- 
vide for a broader and more thorough treatment of psychol- 
ogy, would make possible the teaching of more useful and 
interesting materials, and cause students to more readily grasp 
psychological concepts, thus causing them to work up to the 
level of their ability. If this be true, we should expect stu- 
dents in the homogeneous sections to be more favorably im- 
pressed with the course in psychology than those taught in the 
heterogeneous sections. Similarly, in comparing psychology 
with other courses they have had, we should expect decidedly 
more favorable opinions of psychology from the students 
taught in homogeneous sections W and X than from those 
taught in heterogeneous sections Y and Z. To test this hy- 
pothesis, the responses made on the questionnaire by members 
of the four sections which form the basis of this experiment 
were analyzed.

The questionnaire affords us a direct approach to many of
the arguments concerning the value of sectioning on the 
basis of ability. We have pointed out in the preceding sec- 
tions many of the weaknesses of the questionnaire method, 
and have shown how we have attempted to overcome such 
weaknesses. If there is anything that a student should have 
a definite opinion about it is whether or not he is bored with 
a course, and whether he likes one course better than an- 
other. The writers on the subject of sectioning have made 
such assertions as, ‘‘the students are bored,’’ ‘‘the students 
lack interest in their work because of its elementary nature,’’
‘‘observe an ordinary college class, and one will find
one-fourth of the students beyond instruction,’’ and similar
statements. Most of these statements have been made in the 
most decided manner, and upon no stronger evidence than 
the personal opinion of the individual making them. In this 
study we have gone to the students, and asked them for their 
frank and honest opinion. In what better manner can we 
discover whether students prefer to be taught in homogene- 
ous groups or heterogeneous ones than to ask them? Why 
should we resort to personal speculation, when results of 
such subjective types of action invariably lead to arguments, 
misunderstandings and exaggerations? Seashore’s statement 
that the dull students ‘‘sit listless and helpless’’ was prob- 
ably an over-statement. While such enthusiasm on the part 
of so prominent an educator probably aroused wide-spread 
interest in the problems of meeting the needs of the individ- 
ual student, it is likely that it gave rise to a wide-spread and 
uncritical acceptance of the hypothesis before it was proved. 
The writer, in discussing this point with college instructors 
has found that the majority hold exactly the opposite opin- 
ion. Instructors report that the dull students have learned 
to sit up and look as if they were understanding what was 
going on as a technique which will make an impression upon
a good many teachers. But since such evidence is in the
same category as that which we have been criticising, let us
proceed to our data.

TABLE 13
Table of condensed results dealing with the superior students’ opinions towards the course
as a whole and its level of difficulty

                                           I HOUR DIVISION          III HOUR DIVISION
No. of    Question and Evaluation          Homo-        Hetero-     Homo-      Hetero-
Question  Used Herein                      geneous      geneous     geneous    geneous
                                           Sec. W       Sec. Y      Sec. X     Sec. Z
                                           [illegible]  N 48        N 54       N 53

1         If I had a brother or sister
          in next year’s sophomore
          class I would insist or
          recommend that he take
          psychology.                      90%          ?4%         87%        96%

2         Comparing psychology with
          other courses I would rank
          it first, in the highest
          one-fourth, above average
          in quality.                      92%          96%         89%        100%

3         Comparing psychology with
          other courses I would rank
          it first, in the highest
          one-fourth, above average
          in difficulty.                   67%          19%         80%        72%

4         Comparing psychology with
          the average college course,
          I rate it more, equally:
          a  Difficult                     75%          85%         ?2%        68%
          b  Provocative of thought        95%          ?8%         96%        96%
          c  Interesting                   91%          93%         99%        94%
          d  Valuable for other
             college courses               79%          84%         92%        89%
          e  Applicable to every-day
             life                          99%          98%         ?6%        99%

7         I intend to take more
          psychology                       58%          65%         54%        55%


Table 13 is made up of a condensed summary of the ques- 
tionnaire findings. It contains the number and nature of the 
questions asked, with the percent of the various groups which 
answer them in a favorable manner. Questions 1, 2, 3 and 4, 
and 7, which are given in the table, all have to do with stu- 
dents’ opinions of the introductory course. It is apparent 
from the percent of favorable responses listed in the last two
columns of the table that both the homogeneous and
heterogeneous sections consider psychology in a very favorable
light. There is no evidence that the students taught in the 
homogeneous groups consider psychology any more interest- 
ing or valuable than do students taught in the heterogeneous 
groups. In fact the reverse is true. The superior students 
in the large lecture sections composed of students of hetero- 
geneous ability are more favorably inclined towards the 
course than are the ones in homogeneous sections. Furthermore,
the responses to questions 3 and 4a in table 13, which have to do
with the relative difficulty of psychology in comparison with other
college courses, show that the superior students think psychology
is a difficult course. These
findings substantiate those found previously when we were 
discussing achievement. The results, then, are the same in 
both objective and theoretical measurement. The students’ 
grades do not show that segregation increases achievement 
and their replies show that they do not prefer segregation. 
In considering the various practical methods of organizing 
the introductory course it was found that the homogeneous 
sections W and X rank small sections first, lecture-quiz sec- 
tions second and large lectures third. The heterogeneous 
section Y ranks lecture-quiz first, lecture second and small 
sections third, and heterogeneous section Z ranks large lec- 
tures first, lecture-quiz second and small sections third. It 
seems the superior students prefer to be in the type of sec- 
tions in which they found themselves at the time of answer- 
ing the questionnaire. The heterogeneous groups would pre- 
fer small sections if they could be sure of instruction of as 
high quality as they are receiving in the large sections. But 
the fact that the superior students who are receiving instruc- 
tion in the large sections realize that in the practical situa- 
tion it is better to be a member of a large group taught by 
good instructors than to take a chance on small sections 
where the ability of the instructor is doubtful, suggests that 
superior students believe efficient teaching can be done in
large classes. This conclusion is based upon the results ob- 
tained from heterogeneous sections Y and Z, the control 
sections. These results are probably more valid than those 
obtained from homogeneous sections W and X, as there are 
probably a number of students in these latter sections who 
have had no experience with large sections, whereas, in het- 
erogeneous sections Y and Z, the control groups, they have 
all had experience with small sections in other courses and 
are having experience with the large lecture method in the 
course in psychology; consequently, they have a better basis 
for comparison. 

Another claim made by the advocates of the method of 
sectioning on ability is that the superior students do not like 
heterogeneous sections, because they are held back due to the 
inferior ability of their classmates. Question 10 of the ques- 
tionnaire deals with this point. It requests the students to 
indicate the nature of the group in which they prefer to 
work. Analysis of the questionnaire indicates that three of 
the four sections show a large percentage of their members 
preferring the heterogeneous sections. Heterogeneous sec- 
tion Z is about equally divided concerning this point. It 
seems from these findings that the superior students prefer 
to be taught as members of sections made up of students of 
various ranges of ability. These findings do not substantiate 
the hypothesis that students prefer to work with members of 
their own level of ability. 

One other consideration can be evaluated from our results
on the opinions of superior students toward the course in
introductory psychology: their attitude towards the content
of the course. The findings on this question, question 6,
indicate that the order of preference for various topics of the
course is much the same for both the homogeneous and
heterogeneous groups. Furthermore, the ranks are much the
same as those found in section II for the whole group of
students who took psychology during 1928, 1929 and 1930. The
more practical subjects are ranked higher than the theoretical.

One finding stands out as a result of the above analysis of
the questionnaire results: there are no differences of
significant magnitude found between the students taught in
homogeneous and in heterogeneous classes to indicate that
homogeneous grouping is superior to heterogeneous grouping.

A third approach to the question of whether there is any
difference in the value of the two methods of teaching under
discussion is that of correlation. Further light is thrown
upon our problem when achievement is correlated with ability
and again when it is correlated with previous performance in
college. These correlations are reproduced in table 14. The
correlations between achievement and C.A.T. are low, as would
be expected because of the homogeneity of the group. Both
homogeneous and heterogeneous sections behave in about the 
same manner when correlated with this criterion. The corre- 
lations for the first quarter’s work show performance less 
highly related to ability than those of the second. Although 
the coefficients reveal little difference between the perform- 
ance of the two groups, they do seem to indicate that some 
motivating factor is at work the second quarter that causes an 
increase in the magnitude of the relationship herein pre- 
sented. This would seem to indicate that the students are 
more nearly reaching their potential level of achievement than 
they were the first quarter. Since it has been pointed out that 
the work of the second quarter is considered slightly more in- 
teresting than that of the first, it may be that this increase in 
interest is the motivating factor; but, whatever it is, it does 
not operate differentially in favor of the segregated sections 
as the increase in magnitude of the correlations is present in 
all sections. 
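
The remark that correlations with C.A.T. are low because of the homogeneity of the group describes what is now usually called restriction of range. A minimal sketch with simulated scores follows. The sample size, the underlying correlation of about .6, and the top-fifth cutoff are all assumptions chosen for illustration; none of these figures comes from the present study.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, rho=0.6, seed=1):
    """Simulate (ability, achievement) pairs with population correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        achievement = rho * ability + math.sqrt(1 - rho ** 2) * noise
        pairs.append((ability, achievement))
    return pairs


pairs = simulate()
r_all = pearson_r([a for a, _ in pairs], [g for _, g in pairs])

# Keep only the top fifth on ability, as when a superior group is selected.
cutoff = sorted(a for a, _ in pairs)[int(0.8 * len(pairs))]
top = [(a, g) for a, g in pairs if a >= cutoff]
r_top = pearson_r([a for a, _ in top], [g for _, g in top])

print(f"whole group r = {r_all:.2f}, selected group r = {r_top:.2f}")
```

In such a simulation the coefficient for the selected group falls well below that for the whole group, although the underlying relation between the two variables is unchanged.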

Analysis of the scatter plots did not reveal any significant
difference in the performance of the students taught by the
two methods. One fact was apparent from this analysis: none
of the sections are working up to their maximum ability.
This was pointed out in section III also, and is only
additional proof of the statement that one of the reasons for
low correlation between ability test scores and achievement
scores is lack of maximum motivation.

TABLE 14
This table shows correlations between final grades earned at the end of
each quarter and college ability test scores and honor point ratio

Groups: EXPERIMENTAL (Homogeneous Sec. W, Homogeneous Sec. X) and
CONTROL (Heterogeneous Sec. Y, Heterogeneous Sec. Z)
Columns for each section: Grade 1F, Grade 2W
Rows: C.A.T., H.P.R.

[Most entries of this table are illegible in this copy. Legible
fragments: N = 48 and N = 52 for two of the sections; coefficients
.64, .57, .49, .50, .53 and .46, apparently in the H.P.R. row, with
column assignment uncertain.]

When previous achievement is used as the criterion with 
which to compare achievement in psychology, we find some 
rather interesting results. Since honor point ratio represents 
the student’s achievement in heterogeneous classes, it would 
seem that the grades of students taught in homogeneous classes 
might correlate lower with this measure than those taught in 
heterogeneous classes. Especially would this be true if the 
claims commonly made for homogeneous classification were 
true. That no such results were obtained is apparent from a 
study of table 14; in fact the opposite is the case. Grades of 
students in the heterogeneous classes correlate lower with 
H.P.R. than do those taught in the homogeneous sections. 
Analysis of the scatter plots showed that there was no decided 
tendency at work in causing this difference, although there 
was a slight tendency for the experimental sections to show a 
wider variability. 

This method of analysis furnishes results similar to those
obtained by the other methods employed in this investigation;
namely, there does not appear to be any very great difference
in achievement in introductory psychology when students are
segregated in homogeneous sections composed of students of
high native ability, as compared with leaving them in large
sections of students of varying levels of ability.

SUMMARY AND CONCLUSIONS


1. In introductory psychology there is no statistically
significant difference in achievement, as measured by the
examinations described in this section, between students
taught by the all-lecture method and those taught by the
lecture-quiz method.

2. We have assumed that grades in the introductory course 
in psychology reflect the quality of instruction in that course 
and that scores on the College Ability Test indicate with fair 
accuracy ability for college work. On the basis of these as- 
sumptions correlations computed between grades earned in 
psychology and college ability test demonstrate no relative 
superiority for either the lecture or lecture-quiz method of 
teaching psychology. 

3. The correlation between expected achievement as mea- 
sured by the equation score and achievement for the sections 
taught by the lecture-quiz method is no higher than for the 
group taught by the straight lecture method; hence there is no
indication that one method of teaching is any more effective 
than the other. 

4. Student opinions are slightly more favorable toward the 
general value of the introductory course in psychology when 
taught by the lecture method than when taught by the lec- 
ture-quiz method. 

5. Students prefer methods of teaching involving large
lecture sections to the traditional small sections taught by
the lecture-discussion method.

6. It was found that students taught by the lecture-quiz 
method learn slightly more than those taught by the all-lec- 
ture method, but conversely those taught by the straight lec- 
ture method retain slightly more of what they have learned. 
In both situations the differences are small, hence we may 
conclude that from the point of view of amount learned and 
amount retained, both methods are of equal value as systems 
of teaching introductory psychology. 

7. The methods of teaching herein discussed and the system 
of examinations used are influential in producing overlearning,
as shown by the fact that both divisions forget only a
small per cent of what they had learned after a lapse of three 
months. It is recommended that these methods be continued 
and supplemented with comprehensive examinations to cover 
learning and retention in larger units. 

8. All the evidence found in this investigation indicates 
that the lecture-quiz method of teaching psychology is very 
little, if any, better than the method of all-lecture. In view 
of these facts it is recommended that the lecture method be 
adopted in teaching this course owing to its greater economy. 

9. Students of superior native ability, when chosen on the 
basis of College Ability Test scores and Honor Point Ratio, 
tend to learn slightly less introductory psychology when 
taught in small sections composed of students of high ability 
than when taught as members of a large group made up of 
a wide range of ability. 

10. When student opinion towards the course is studied we 
find that students in heterogeneous classes consider the general 
course in psychology in a slightly more favorable light than 
do those taught in homogeneous sections, although the differ- 
ence is possibly negligible. 

11. Both the homogeneous and heterogeneous groups con- 
sider the course very favorably in comparison with their other 
college courses. They also consider it a more difficult course 
than the average course which they have taken on this campus. 

12. Superior students rank the topics of the course content 
in about the same order when taught in heterogeneous groups 
as they do in homogeneous groups. There is little difference 
in the order of importance of the content of the introductory 
course when rated by superior students and the student group 
as a whole. 

13. About three-fourths of the superior students prefer 
their classmates to have varying degrees of ability. 

14. In general superior students rank highest the type of 
class organization in which they are enrolled. But superior 
students who have had the introductory course in psychology
taught in large sections would prefer large classes if well
taught rather than take a chance on a poor instructor in small 
sections. 

15. There are no significant differences found in the magni- 
tude of correlations between achievement and ability, and be- 
tween achievement and past performance. There does seem 
to be a factor at work the second quarter which is acting as a 
motivating influence in that the correlations between ability 
and achievement are somewhat higher the second quarter 
than they are the first. 

16. When measured by the instruments used in this study 
there are few differences found between superior students 
taught as members of large heterogeneous sections and as 
members of small homogeneous groups either in achievement 
or in their subjective opinions.


THE INFLUENCE OF FORM OF TYPE ON THE
PERCEPTION OF WORDS


MILES A. TINKER 


University of Minnesota 


There are three forms of type that are commonly employed
in printing: lower-case (small letters), italics, and all capitals.
Printing in lower-case is most common, of course, but the
reader frequently encounters whole paragraphs and even
pages printed in italics or in capitals. Most individuals
prefer to read material printed in lower-case. The subjective
reaction of annoyance in reading text in capitals or italics may
be significant, for this material is read appreciably more
slowly than text printed in lower-case type. Tinker and
Paterson¹ have demonstrated that speed of reading depends to a
considerable degree upon form of type. They discovered that
text in lower-case was read 13.4 per cent faster than material
in capitals, and 2.8 per cent faster than italicized text. The
direction of these differences was established with a high
degree of certainty.²

The present investigation is concerned with words printed
in all capitals and in lower-case type only. At least three
factors probably operate to produce faster reading in
lower-case than in all capital text: (1) material printed in all
capitals covers approximately 35 per cent more space than the
same text in lower-case letters. This would tend to decrease
speed of reading the material in capitals, for Tinker³ has
shown that the number of pauses and time of reading are
increased when the same content is spread over a larger printing
space. (2) Material printed in all capitals produces a novel
or less familiar reading situation since the ordinary
individual is concerned largely with material in lower-case type.
This lack of familiarity may produce slower reading. (3)
Several investigations indicate that total word-form is an
important factor in perceiving words. Because word-form
appears to be more characteristic with words printed in
lower-case than with those in capitals, definiteness of
word-form is possibly involved in the faster reading of text in
lower-case type.

¹ M. A. Tinker and D. G. Paterson, Influence of type form on speed of
reading, J. Appl. Psychol., 1928, 12, 359-368.

² When the formula for correlated measures is employed, as should
have been done in the original report, $D/\sigma_D$ for lower-case and
italics becomes 4.47.

³ M. A. Tinker, Numerals versus words for efficiency in reading,
J. Appl. Psychol., 1928, 12, 190-199.

The purpose of the present study is to investigate the influ- 
ence of form of type on the perception of words by comparing 
in three ways words and letters printed in lower-case and in 
capitals: (1) perceptibility of words; (2) perceptibility of 
letters, and (3) definiteness of word-form.

Six students majoring in psychology served as subjects in 
the experiment. All subjects had emmetropic or adequately 
corrected vision. 


APPARATUS AND PROCEDURE 

The perceptibility of words and letters was determined by
the distance method, employing an apparatus which has been
described in a previous report.⁴ This apparatus consists
mainly of an extended bench along which moves a sliding
carriage holding the words or letters to be read. One end of the
bench is attached to a small table which carries a head-rest.
The material to be read is uniformly illuminated with an
intensity of 35 foot-candles per square foot.

The stimulus material consisted of 105 familiar five-letter
words. Many groups of two or more words with similar total
form were included, as ‘‘there’’ and ‘‘these.’’ To form
five-letter series of nonsense letters in which occur the
identical letters employed in words, the letters from 40 of the
105 words were put in reverse order as ‘‘evoba’’ from
‘‘above.’’ All material (lower-case and capitals) was printed
in ten-point type, Scotch, on white cardboard. The words were
arranged, well separated from each other, three words per line
and five lines per 8½ x 11″ sheet of cardboard. Six groups of
unrelated letters were placed on each sheet with the exception
of one which had four. This yielded four groups of stimulus
material: (1) 105 words in capitals; (2) the same 105 words
in lower-case; (3) 40 groups (200 letters) of unrelated letters
in capitals, and (4) 40 groups (200 letters) of letters in
lower-case.

⁴ M. A. Tinker, The relative legibility of modern and old style
numerals, J. Exp. Psychol., 1930, 13, 453-461.

In arranging the material for observations care was taken 
to distribute practice and fatigue effects equally. Each group 
of words and unrelated letters was read once by each of the 
six subjects. 

The experiment was conducted in a semi-darkened room. 
Observing was begun with the stimulus sheet in the carriage 
at the far end of the bench. After each reading the carriage 
was moved 20 cm. nearer to the subject. This was continued
until all words or letters on the sheet were apprehended cor- 
rectly. (Details of procedure are given in the earlier report, 
see footnote 4.) The experimenter recorded mistakes and the 
distance from the subject at which each word or letter was 
read correctly. 
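
The stepping procedure just described can be sketched as a short loop. The starting distance of 400 cm. and the notion of a fixed threshold distance for each item are assumptions made only for illustration; the text gives the 20 cm. step but not the length of the bench.

```python
def recorded_distance(threshold_cm, start_cm=400.0, step_cm=20.0):
    """Return the first carriage position at which an item is read correctly.

    threshold_cm is a hypothetical greatest distance at which the item is
    legible to the subject. The carriage starts at start_cm and is moved
    step_cm nearer after each unsuccessful reading.
    """
    position = start_cm
    while position > 0:
        if position <= threshold_cm:
            return position  # distance the experimenter would record
        position -= step_cm
    return None  # never read correctly before reaching the subject


print(recorded_distance(185.0))
```

Because the carriage moves in fixed steps, the recorded distance is the threshold rounded down to the nearest carriage position, so each recorded value understates the true threshold by less than one step.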


RESULTS AND DISCUSSIONS 

The basic data of this investigation are given in Table 1.
In the top section are the mean and standard error of the
mean for letters in nonsense series and for words in
lower-case type; in the bottom section are the data for all
capitals. The means represent the average distances from the
subjects’ eyes at which the words or letters were read
correctly. The fourth and fifth columns contain the data for
apprehension of the 40 words from which the 40 series of
unrelated letters (columns 2 and 3) were derived. This permits
a comparison of letters with the identical words from which
they were derived as well as with the entire group of 105
words.

TABLE 1
The Perception of Letters in Nonsense Series and Words when Printed
in Lower Case and Capitals.

(1)         (2)      (3)      (4)      (5)      (6)      (7)
Subject     Letters           40 words          105 words
            Mean     S.E.     Mean     S.E.     Mean     S.E.

LOWER-CASE
H           184.8    2.7      192.5    5.3      194.5    2.9
I           175.3    2.7      186.5    5.8      185.1    3.0
M           147.5    2.2      154.5    3.9      155.6    2.1
G           135.1    2.5      136.5    4.7      137.7    2.9
S           143.8    2.4      154.0    4.9      152.2    2.4
W           201.9    2.9      213.5    6.5      213.9    3.5
Total*      164.6    1.3      173.4    2.8      173.2    1.6

CAPITALS
H           263.1    3.5      268.0    5.3      276.6    3.3
I           244.9    3.4      243.0    4.6      244.0    2.8
M           218.2    2.9      227.5    4.7      222.7    2.9
G           189.6    2.6      190.0    5.3      189.5    3.0
S           195.4    2.6      205.5    4.5      204.0    2.5
W           290.5    3.       293.5    3.7      292.9    3.0
Total*      233.3    1.7      238.8    3.0      239.9    1.9

* Average and σ’s for total group are computed from original data.

For the lower-case printing the total average of 173.4 cm.
from the eye at which the 40 words were correctly read is
almost identical to the average of 173.2 cm. for 105 words.
Also the mean for each subject is approximately the same
whether 40 or 105 words are read. Similar trends are reported
for all capitals where the total averages are 238.8 and 239.9 cm.
respectively. Therefore, letters will be compared with both the
large and small group of words since the former is
representative of the latter, and yields a more stable mean.

A summary of important comparisons from Table 1 is
given below:

1. L.C. words (105) vs. words in cap. (105): $D = 66.7$, $D/\sigma_D = 26.89$
2. L.C. nonsense vs. nonsense in cap.: $D = 68.5$, $D/\sigma_D = 29.02$
3. Lower-case: nonsense vs. 40 words: $D = 8.8$, $D/\sigma_D$ = [illegible in source]
4. Lower-case: nonsense vs. 105 words: $D = 8.6$, $D/\sigma_D = 4.38$
5. Capitals: nonsense vs. 40 words: $D = 5.5$, $D/\sigma_D = 1.57$
6. Capitals: nonsense vs. 105 words: $D = 6.6$, $D/\sigma_D = 2.71$


Reference to Table 1 reveals that the direction of the
differences found in these comparisons of group averages holds
without exception for every subject in the group. Comparisons
1 and 2 show that both words and letters in capitals are read
accurately much farther from the eye than when printed in
lower-case type. The ratios $D/\sigma_D$ indicate that the
direction of the differences is reliably established. Although
both capitals and lower-case were printed in ten-point type,
the capital letters do have larger outlines, which probably
explains these results.
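
As a rough check on comparisons of this kind, the differences and their ratios to the standard error of the difference can be recomputed from the totals in Table 1. The sketch below assumes the formula for uncorrelated means, $\sigma_D = \sqrt{\sigma_1^2 + \sigma_2^2}$. The formula actually used for these data is not stated, and the formula for correlated measures would give different ratios, so only approximate agreement with the printed values should be expected.

```python
import math

# (mean distance in cm., standard error of the mean) for the total group,
# taken from the "Total" rows of Table 1.
TOTALS = {
    ("lower", "letters"): (164.6, 1.3),
    ("lower", "words40"): (173.4, 2.8),
    ("lower", "words105"): (173.2, 1.6),
    ("caps", "letters"): (233.3, 1.7),
    ("caps", "words40"): (238.8, 3.0),
    ("caps", "words105"): (239.9, 1.9),
}


def critical_ratio(a, b):
    """Return (D, D / sigma_D) for two (mean, s.e.) pairs.

    Assumes the two means are independent, which is an assumption of
    this sketch and not a statement about the original computation.
    """
    (mean_a, se_a), (mean_b, se_b) = a, b
    d = mean_a - mean_b
    sigma_d = math.sqrt(se_a ** 2 + se_b ** 2)
    return d, d / sigma_d


comparisons = [
    ("caps words (105) vs. l.c. words (105)", ("caps", "words105"), ("lower", "words105")),
    ("l.c. 40 words vs. l.c. nonsense", ("lower", "words40"), ("lower", "letters")),
    ("l.c. 105 words vs. l.c. nonsense", ("lower", "words105"), ("lower", "letters")),
    ("caps 40 words vs. caps nonsense", ("caps", "words40"), ("caps", "letters")),
    ("caps 105 words vs. caps nonsense", ("caps", "words105"), ("caps", "letters")),
]

for label, first, second in comparisons:
    d, ratio = critical_ratio(TOTALS[first], TOTALS[second])
    print(f"{label}: D = {d:.1f}, D/sigma_D = {ratio:.2f}")
```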

Lower-case letters in nonsense arrangement are read
correctly at 8.8 (or 8.6) cm. nearer the subject’s eyes than
words in the same form of type. The direction of this
difference is also reliably established, as shown by the size
of $D/\sigma_D$. A similar comparison for all capitals reveals a
difference of 5.5 (or 6.6) cm. in favor of the words also. This
difference, which is smaller (absolutely and comparatively)
than for lower-case, is very unstable in direction since
$D/\sigma_D$ equals only 1.57 (or 2.71). It appears therefore
that words in all capitals are read much like groups of
unrelated letters, i.e., by letters rather than by word-form.
With words in lower-case this tendency to read by letters is
comparatively much less pronounced, indicating greater potency
of word-form in perception. It is likely, however, that
word-form is active to some degree in reading words in
capitals, for these words are consistently read at a greater
distance from the subject than are the unrelated letters in the
same form of type.

Originally the order of letters in the 40 words was reversed 
to form nonsense series in order that recognition of a word 
would not automatically lead to spelling it correctly when at- 
tempting to read the letters. As a check on the above find- 
ings one trained observer (the writer) served as subject when 
the stimulus material consisted entirely of words. Both the 
words and the separate letters of the words were to be identi- 
fied. The results correspond to those reported above but it 
was very difficult for the subject to prevent knowledge of a 
word from influencing the perception of its constituent letters. 

The misreadings made by the subjects furnish further
evidence that word-form is more potent in perception of words
in lower-case than in capitals. Below are the numbers of
wrong words given during the reading of 105 words:

Subject     Capitals    Lower-case
H             62        [illegible]
I             64           88
M             45        [illegible]
G             85          101
S             45           59
W             64           92
Average       60.8         79.7

Difference between averages equals 18.9


The consistency of these results is striking. For every sub- 
ject there were many more misreadings in lower-case than in 
all capital printing. In reading 105 words, 18.9 more wrong
words were given in lower-case than in all capitals on the
average. Total word-form often became apparent before 
enough letters were perceived to apprehend the word cor- 
rectly. This led to a response which in many instances was 
not the correct word, but one of similar total form. It was 
obvious to the subjects that words were being identified while 
some of the letters were still unidentifiable. Therefore the 
greater number of misreadings for words in lower-case type 
indicates that word-form is more influential there than in 
perceiving words in capitals. 
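
The reported averages for the misreadings can be checked against the individual counts. The sketch below uses only the counts that are legible in the table; the lower-case counts for subjects H and M are treated as unknown (None) and are not guessed.

```python
capitals = {"H": 62, "I": 64, "M": 45, "G": 85, "S": 45, "W": 64}
lower_case = {"H": None, "I": 88, "M": None, "G": 101, "S": 59, "W": 92}

# Mean number of wrong words per subject for all-capital printing.
mean_capitals = sum(capitals.values()) / len(capitals)

# The lower-case mean is taken as reported, since two counts are unreadable.
reported_lower_mean = 79.7
difference = reported_lower_mean - mean_capitals

# Combined total the two unreadable counts must have if the reported
# mean is correct. This is an implied figure, not a value from the source.
known_lower = sum(v for v in lower_case.values() if v is not None)
implied_missing_total = reported_lower_mean * len(lower_case) - known_lower

print(round(mean_capitals, 1), round(difference, 1), round(implied_missing_total))
```

The capitals average and the difference between averages computed this way agree with the reported 60.8 and 18.9.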

The findings of this study have an important bearing on 
printing practice. Where smooth and rapid reading is de- 
sired, lower-case type should be employed. However, the 
material should be in all capitals where perceptibility at the 
greatest possible distance from the reader is essential and 
quickness of apprehension is of minor concern. 


SUMMARY AND CONCLUSIONS 

1. The perceptibility of words and groups of unrelated 
letters when printed in lower-case and in all capitals was ob- 
tained by means of the distance method. 

2. Both capital letters and words in capitals were read at 
greater distances from the subject than letters or words in 
lower-case. 

3. There was only a small and statistically insignificant dif- 
ference between distances at which unrelated capital letters 
and words in capitals were read correctly. With lower-case 
type, however, the difference between distances for apprehend- 
ing words and unrelated letters is greater and the direction 
of difference is very stable statistically. These findings
indicate that total word-form is more potent in the perception
of words in lower-case than in all capitals where perception
seems to occur largely by letters.

4. The above conclusion is supported by the fact that the
reading of words in lower-case yielded more misreadings than 
the words in capitals. The incorrect word frequently had a 
total form similar to the stimulus word, especially in lower- 
case printing. 

5. The greater influence of total word-form on perception 
of words in lower case in comparison with words in capitals 
is an important factor contributing to the faster reading of 
text in lower-case type. 











A STUDY OF FATIGUE IN THREE-HOUR COLLEGE 
ABILITY TESTS 


VICTOR H. NOLL¹
Office of Education, Washington, D. C. 


The problem of the measurement of fatigue in human sub- 
jects is ordinarily approached by either of two methods. 
The first, which may be called the direct method, involves the
measurement of efficiency in work continued without inter- 
ruption over a period of time, fatigue being measured in terms 
of decreased efficiency in the task itself. Many experiments 
of this type have been made, the results in most cases indicat- 
ing that if the task is continued long enough, a greater or less 
decrease in efficiency takes place depending on the difficulty 
of the work, the length of time it is continued, the
motivation of the subjects, etc.

The second method of measuring fatigue may be called the
indirect one. In this method, the subjects are kept at a task 
for some time, efficiency being measured before and after by 
end tests on material which is more or less similar to the mate- 
rial of the main task. One advantage of the indirect method is 
that in experiments of this type it is somewhat easier to main- 
tain the interest of the subjects than in experiments making 
use of the direct method since the test material usually differs 
somewhat from the practice material. The chief weakness of 
most reported experiments on the measurement of fatigue lies 
in the fact that it is very difficult to determine whether a 
measured decrease in efficiency is due to actual fatigue or loss 
of interest. 

The present report deals with two experiments conducted at
the University of Minnesota. In these experiments the
indirect method was used. The subjects were students who had
spent one or two years in a Minnesota State Teachers’ College
and were transferring to the College of Education of the
University of Minnesota. In making this transfer these persons
are required by the College of Education to take a battery
of College Ability Tests.

¹ Member of the Staff of the National Survey of Secondary Education,
Office of Education, Washington, D. C.

This battery of tests requires about three hours of
practically continuous work on the part of each individual, no
interruption being permitted excepting that which occurs in
distributing and collecting the various tests and materials.
The tests include the Miller Analogies Test,² a test consisting
of one hundred difficult analogies with a time limit of forty
minutes; the Minnesota Reading Examination,³ consisting of a
vocabulary test and a reading comprehension test requiring a
total of forty-six minutes; the Meier-Seashore Art Judgment
Test,⁴ involving judgments as to the comparative merits of each
of one hundred and twenty-five pairs of pictures; and the
filling out of a personal and family history folder. There is
no time limit on the Meier-Seashore Art Judgment Test, but it
usually requires about forty-five minutes, and about the same
amount of time is required for the personal history folder.

The subjects of these experiments took this battery of tests
and their efficiency was measured immediately before and
after this testing by the Peterson Uniform Equation
Completion Test,⁵ of which 6 + 4 = 12 _ 2 is a sample item. The
task in this test is to supply the missing +, −, ×, or ÷ sign
as the case may be. The test consists of hundreds of similar
equations which are supposedly of uniform difficulty. The
subjects were given twenty-five minutes of practice on the
Peterson Test at the beginning and fifteen minutes on a new
sheet at the end. At the end of each five minutes of work on
this test the subjects were asked to draw a line under the last
equation completed.

² Miller, W. S. Analogies Test. Unpublished.

³ Haggerty, M. E., and Eurich, A. C. Minnesota Reading Examination.
Minneapolis. University of Minnesota Press. 1930.

⁴ Meier, N. C., and Seashore, C. E. Art Judgment Test. Iowa City.
Bureau of Educational Research and Service. 1929.

⁵ Peterson, J. C. Uniform Equation Completion Test. Unpublished.
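
A single item of this kind can be solved mechanically by trying each sign in turn. The sketch below assumes that the sample item has one sign missing on the right-hand side, so that the task is to make 12 and 2 combine to equal 6 + 4. The function name and the encoding of the item are illustrative and are not taken from the Peterson test itself.

```python
import operator

# The four signs a subject may supply, mapped to their arithmetic.
SIGNS = {
    "+": operator.add,
    "-": operator.sub,
    "x": operator.mul,
    "/": operator.truediv,
}


def missing_signs(target, left, right):
    """Return every sign s for which `left s right` equals `target`."""
    found = []
    for sign, fn in SIGNS.items():
        if sign == "/" and right == 0:
            continue  # division by zero cannot complete an equation
        if fn(left, right) == target:
            found.append(sign)
    return found


# Hypothetical encoding of the sample item: 6 + 4 = 12 _ 2
print(missing_signs(6 + 4, 12, 2))
```

Returning a list allows for items where more than one sign would complete the equation, as with 2 _ 2 = 4.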

Two experiments were conducted with different groups on 
different days, both between the hours of one and five P. M. 
The technique in both experiments was identical except for 
one factor. In the first experiment after finishing the battery 
of College Ability Tests and just before beginning the final 
fifteen minutes of work on the Peterson Test the subjects were 
told that their standing on the test battery would depend in 
part on the amount of improvement shown on the Peterson. 
In the second group no statement was made other than that 
they were being asked to work again for a short time on the 
Peterson. 

The purpose of these experiments was not to determine 
whether fatigue could be induced but to determine whether 
these subjects were measurably less efficient after taking a 
battery of tests such as have been described. It was believed 
that the test material was sufficiently interesting and the sub- 
jects motivated enough to keep them working at maximum 
efficiency for the full three hours. Hence the problem of 
maintaining interest and maximum expenditure of mental 
effort was believed not to be a factor in these experiments. 

The above described procedure provided two situations in 
which the efficiency of the subjects was determined by pre-test, 
after which they worked for three hours on the usual type of 
college ability test material, and then efficiency on the same 
type of test as was used in the pre-test was again determined. 

In the first group, hereafter designated as the ‘‘motivated’’
group, there were forty-four individuals; in the second or 
‘‘unmotivated’’ group there were twenty. 

In Table I are shown average scores by five-minute periods
of work on the Peterson Test. The score on this test is the
number of equations correctly completed regardless of the
number attempted. Except for one or two of these periods
for each group, there appears to be a small but regular
increase in score from the first period to the last. The
interval of three hours of testing between Period V of the
preliminary testing and Period I of the final testing does not
appear to have caused any decrease in efficiency in either
group. It is interesting to note that the supposedly motivated
group shows a slightly greater total gain from the first
five-minute period to the last and a gain following the testing
period which is more than three times that made by the
unmotivated group over the same interval.

TABLE I
Scores on Peterson Equation Completion Test by five-minute periods
(Averages for all individuals in each group)

                 Preliminary testing               Final testing
                 I     II    III   IV    V         I     II    III       N
Motivated       20.2  24.2  22.0  24.7  26.3      26.9  28.3  30.4      44
Unmotivated     21.3  22.6  23.9  22.9  27.1      29.2  27.1  32.4      20

                 Average score for    Average score for
                 25 minutes           15 minutes
Motivated             23.5                 28.5
Unmotivated           23.6                 29.6

                 Average score for    Average score for
                 10 minutes           10 minutes
Motivated             25.5                 29.2
Unmotivated           25.1                 29.8

The average score for the first twenty-five minutes of work 
on the Peterson is practically the same for both groups, but 
the unmotivated group has a slightly higher average score 
than the motivated group on the final fifteen minutes of work 
on this test. 

If the first fifteen minutes of the preliminary testing and 
the first five minutes of the final testing on the Peterson be 
disregarded (as warming-up periods), it is possible to compare 
the average scores of the groups on similar ten-minute periods 
of work on the Peterson. These comparisons, also shown in 
Table I, indicate that the two groups are practically identical 
in average scores made in each of these ten-minute periods. 
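
The summary rows of Table I can be approximately recomputed from the period means. The sketch below is only an illustration: it works from the rounded group means printed in the table, not from the individual scores, so the ten-minute figures it yields can differ by a tenth or two from the printed ones. The names `PRELIMINARY`, `FINAL`, `mean`, and `summarize` are introduced here and are not part of the original study.

```python
# Group means by five-minute period, copied from Table I.
PRELIMINARY = {
    "motivated": [20.2, 24.2, 22.0, 24.7, 26.3],
    "unmotivated": [21.3, 22.6, 23.9, 22.9, 27.1],
}
FINAL = {
    "motivated": [26.9, 28.3, 30.4],
    "unmotivated": [29.2, 27.1, 32.4],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def summarize(group):
    """Recompute the summary rows of Table I for one group."""
    pre, post = PRELIMINARY[group], FINAL[group]
    return {
        "first_25_min": mean(pre),                 # all five preliminary periods
        "final_15_min": mean(post),                # all three final periods
        "last_10_min_preliminary": mean(pre[-2:]),  # periods IV and V
        "last_10_min_final": mean(post[-2:]),       # final periods II and III
    }


if __name__ == "__main__":
    for group in PRELIMINARY:
        for label, value in summarize(group).items():
            print(f"{group:12s} {label:26s} {value:5.2f}")
```

Dropping the first three preliminary periods and the first final period, as the text proposes, is what the two `[-2:]` slices do.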

If the average scores of each individual for each of these 
two ten-minute periods are considered, it is possible to deter- 
mine the number of individuals who gained, remained the 
same, or lost. That is, if an individual’s average score on the 
last ten minutes of the final testing on the Peterson exceeds 
his own average score on the last ten minutes of the pre- 
liminary testing, he is classed among those who gained; if his 
average score on the final ten minutes is less than his average 
on the last ten minutes of the preliminary twenty-five minutes 
of work on the Peterson, he is designated as one who lost. 
These comparisons indicate that of the forty-four persons in 
the motivated group, thirty-three, or seventy-five percent, 
gained and eleven, or twenty-five percent, lost or remained 
the same. (Three neither gained nor lost.) Of twenty per- 
sons in the unmotivated group, eighteen, or ninety percent, 
gained and two persons, or ten percent, lost. (All either 
gained or lost.) These results are shown in Table II. 

TABLE II
Numbers and percents who gained, lost, or remained the same on the
Peterson Equation Completion Test. Comparisons of last ten
minutes of final testing with last ten minutes
of Preliminary Testing

                                        LOST OR REMAINED
                     GAINED             THE SAME
                  No.    Percent        No.    Percent
Motivated         33       75           11       25
Unmotivated       18       90            2       10


Here again there appears to be no striking difference be- 
tween the groups. If the three persons in the motivated group 
who neither gained nor lost are disregarded, the percentage 
who lost is reduced to eighteen percent of the total. It is 
likely that the differences which appear here between the 
motivated and the unmotivated groups are not significant, al- 
though the number of cases involved is too small to calculate 
the statistical reliability of these differences. 
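
The classification rule and the percentages quoted above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the count of eight who lost in the motivated group is derived from the text (eleven who lost or remained the same, less the three who neither gained nor lost).

```python
def classify(pre_avg, post_avg):
    """Label one subject by comparing last-ten-minute averages."""
    if post_avg > pre_avg:
        return "gained"
    if post_avg < pre_avg:
        return "lost"
    return "same"


def percent(count, total):
    """Percentage of the group, rounded to the nearest whole number."""
    return round(100 * count / total)


# Counts taken from the text and Table II.
motivated = {"gained": 33, "lost": 8, "same": 3}
unmotivated = {"gained": 18, "lost": 2, "same": 0}

if __name__ == "__main__":
    for name, counts in (("motivated", motivated), ("unmotivated", unmotivated)):
        total = sum(counts.values())
        for label, count in counts.items():
            print(f"{name:12s} {label:7s} {count:3d} {percent(count, total):3d} percent")
```

With the three ties set aside, the motivated group's eight losses out of forty-four give the eighteen percent mentioned in the text.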

The three hours of work by the subjects of these experiments 
on a college ability test provided measures of their intelligence. 
It will be recalled that part of this battery of 
college ability tests consisted of an analogies test and a read- 
ing test. The scores on these tests are expressed in terms of 
percentile ranks based on large numbers of scores of indi- 
viduals who have previously taken the same tests. Compari- 
sons of the two experimental groups with respect to percentile 
ranks on these tests are shown in Table III. 


TABLE III
Comparisons based on percentile ranks of individuals in college ability
tests

                       AVERAGE PERCENTILE RANKS

                   Miller Analogies     Minnesota Reading
Motivated               40.3                  50.4
Unmotivated             46.9                  49.5

                   Those who gained       Those who lost
                    M.A.      M.R.         M.A.      M.R.
Motivated           44.2      54.4         28.4      38.7
Unmotivated         48.3      51.7         35.0      30.0

The average percentile ranks on the Miller Analogies Test 
indicate that both groups are somewhat below the median for 
similar groups on whom the norms are based. (A percentile 
rank of fifty is taken as the standard average.) The average 
percentile ranks on the Minnesota Reading Examination indi- 
cate that the two groups are about average in their reading 
ability as measured here. 

All of the subjects in both groups are next considered in- 
dividually and separated into those who gained and those who 
lost as described above. Those who neither gained nor lost 
were eliminated from the comparisons which follow. There 
were but three such persons, all in the motivated group. On 
both the Analogies Test and the Reading Examination, the 
average percentile ranks of those who gained on the Peterson 
are distinctly higher than the average percentile ranks of 
those who lost, as shown in Table III. 

In order to determine the reliability of these differences, 
the percentile rank on the Analogies and that on the Reading 
test were averaged for each individual, and the average of these 
combined percentile ranks for those who gained was com- 
pared with a similarly obtained average for those who lost. 
These comparisons are shown in Table IV. 

TABLE IV
Reliability of differences between averages of combined percentile ranks
on two college ability tests

                     Average of combined
                     P.R.'s on Miller
                     Analogies and                                      Diff. /
                     Minnesota Reading     Difference    S.E. diff.    S.E. diff.
                     Tests
Those who gained          49.5                16.3          9.54          1.7
Those who lost            33.2


The average percentile rank for those who gained exceeds 
that of those who lost by 16.3, a difference which is 1.7 times 
its standard error. This represents approximately ninety-one 
chances out of one hundred that the difference is a true differ- 
ence greater than zero and in the direction indicated. 
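
The ratio quoted above can be checked directly from Table IV. The sketch below also evaluates two normal-curve areas for a ratio of 1.7. Which probability table the author consulted is not stated, so the correspondence noted in the comments is an observation under the assumption of a normal sampling distribution, not a reconstruction of the original computation; the function names are hypothetical.

```python
import math


def critical_ratio(difference, standard_error):
    """Difference divided by its standard error."""
    return difference / standard_error


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


ratio = critical_ratio(16.3, 9.54)           # figures from Table IV
central_area = 2.0 * normal_cdf(1.7) - 1.0   # area within +/- 1.7 S.E.
one_sided_area = normal_cdf(1.7)             # area below +1.7 S.E.

if __name__ == "__main__":
    print(f"ratio          = {ratio:.2f}")
    print(f"central area   = {central_area:.3f}")
    print(f"one-sided area = {one_sided_area:.3f}")
```

The central area is numerically close to the ninety-one chances in one hundred given in the text; the one-sided area is somewhat larger.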

A further indication of the difference in intelligence between 
the two groups is the bi-serial coefficient of correlation between 
those who gained and those who lost. This coefficient is 
.33 ± .17, which indicates a not very large or reliable relationship 
between gaining or losing on the Peterson and intelligence 
as measured here. 

It is of some interest to note that the correlation between 
the Miller Analogies Test and the Minnesota Reading Exam- 
ination which has often been determined in various groups 
larger than the present ones is usually about .60 while that 

182 VICTOR H. NOLL 


between either of these tests and the Peterson has rarely been 
found to exceed .35. 


SUMMARY 


As a result of these experiments the following statements 
seem justifiable: 

1. Neither of the experimental groups showed any loss in 
efficiency as measured by scores on the Peterson test after 
three hours of the type of work described. On the contrary, 
whether we consider all of the five-minute periods or the last 
two before and after testing, most of the students appear to 
be more efficient after testing than before. 

2. An attempt to motivate one group even more than they 
already had been by the testing situation seemed to have had, 
if anything, a negative effect on the persons in that group. 
It is possible that the group was discouraged rather than 
motivated by the statement made. This may indicate that 
they felt that they had already given their best and that to 
expect any improvement at that stage was unjust. 

3. When the intelligence of those who seemed more efficient 
after testing than before was compared with the intelligence 
of those who lost or remained the same in efficiency, it was 
found that the latter were distinctly lower than those who 
gained. Statistical criteria of the reliability of this difference 
indicate that it is probably not a highly significant one. The 
small number of cases involved makes it rather hazardous to 
draw any conclusions based on the obtained difference. 

4. The intelligence of the groups appeared to be slightly 
below the average of similar groups previously tested at en- 
trance to the College of Education. 

5. It seems evident that it may be quite safe to require 
college students to devote at least as much time as is ordinarily 
done to examination periods without fear either of doing them 
an injustice from the point of view of measurement or of 
tiring them unduly. The chief problem seems to be not one 
of avoiding fatigue or loss of efficiency but one of keeping the 
subjects in a good humor. 

6. These experiments show no results which are startling, 
or very different from what could have been foreseen. They do, 
however, provide some objective evidence on the question of 
what a reasonable length of time for such examination periods 
may be. Usual practice in institutions of higher learning is 
to limit final examinations to two hours; also to permit special 
examinations for students who have more than two examina- 
tions scheduled for the same day. The latter, of course, is 
done at least as much for the purpose of giving the student a 
fair chance to prepare for such examinations, as for avoiding 
fatigue. In general, nevertheless, the aim is to restrict exami- 
nation periods to a length which will not unduly tire the sub- 
jects. There is no evidence here to substantiate the belief that 
a three-hour period of intensive work on materials similar in 
nature to those used in these experiments has any measurable 
effect on the efficiency of persons who participate. It is true 
that a small percentage of the individuals in both experimental 
groups seemed slightly less efficient after three hours of work 
than before. However, the percentages were small and the 
losses, measured both in individual cases and collectively, were 
insignificant. 

It is possible that the similarity in nature of the materials 
in the end tests and in the college ability test battery had some 
bearing on the results. That is, the three hours of work on 
the test battery may simply have served to increase the par- 
ticular ability to do the type of task involved in the Peterson 
Test without increasing the all-around or general efficiency of 
the subjects. It may be said in answer to this criticism that 
in the abilities measured here no general or large decreases in 
efficiency were found. 

It may also be that the testing period keyed the subjects to 
a pitch which served to keep up their efficiency beyond the 
time when they would ordinarily have become aware of fatigue 
or let-down. If this be true, it would be interesting to conduct 
other experiments similar to those described here, and then 
to test the subjects a short time later after they had relaxed 
from the tension of the examination. 

PURPOSE 


The purpose of this investigation is to determine whether 
differences in educational age are to be attributed chiefly to 
differences in mental age, as measured by a group intelligence 
test, or to differences in the amount of training, as measured 
by location in the grades. If we assume that differences in 
mental age are primarily the result of differences in native 
learning ability or learning capacity, then our purpose is to 
find whether variation in school achievement is caused mainly 
by variation in learning capacity or variation in the amount 
of school training. 

If someone were able to show that learning capacity is by 
far the more potent in explaining differences in achievement 
this would not mean that training was unnecessary, as neither 
learning capacity nor training can be dispensed with in ac- 
quiring ability to achieve. It would simply show what degree 
of ability in achievement was to be expected when learning 
capacity and training cooperated in different amounts. 

The determination of the relative potency of training and 
learning capacity in developing ability to achieve is of the 
greatest practical importance. If learning capacity is most 
potent, then educational guidance programs and types of 
school organization which attempt to fit the school tasks to 
the learning capacity of the child are of considerable significance, 
but if learning capacity has relatively little influence 
upon achievement in comparison with training, then our guidance 
programs and the types of school organization which 
aim to fit the work of the school to the child’s learning capacity 
are largely useless. If training is all important then 
any child can be taught to learn anything, and any kind of 
superior achievement can be secured through training. Moreover, 
if training has by far the greatest potency in developing 
ability to achieve, then the eugenic measures which are under- 
taken to get rid of the feebleminded are unnecessary. The 
solution of our problem then has a vital bearing upon the im- 
portance of eugenic measures, guidance programs and types of 
school organization. 


PROGRAM 

The general plan of the investigation is to compare the 
average of the educational age differences of the children who, 
being in the same grade, differ two years in mental age, with 
the average of the educational age differences of the children 
who, having the same mental age, differ two years in grade 
location. In the first case mental age differs two years while 
the grade is held constant, and in the second case the grade 
location differs two years while mental age is held constant. 
When mental age varies and the grade is constant the amount 
of difference in educational age must be attributed mainly to 
differences in mental age, if differences in educational age are 
caused chiefly by differences in mental age and the amount of 
training. When the grade location varies and the mental age 
is held constant, the amount of difference in educational age 
must be attributed mainly to differences in grade location, or 
the amount of training. By comparing the educational age 
difference ascribed to a two-year difference in mental age, with 
the educational age difference ascribed to a two-year difference 
in grade location, a fairly good measure of the relative influ- 
ence of mental age and the amount of training upon educa- 
tional age is obtained. 

In the spring of 1926 mental and educational tests were 
given to some 2,000 elementary school children of Hibbing, 
Minnesota, for the purpose of gathering data relative to a reclassification 
of pupils. Both forms of the Otis Group Intelligence 
Examination and Form B of the Stanford Achievement 
Test were used. The testing was administered by the school 
examiner who had been specifically trained in such work and 
whose sole duties consisted of administering various types of 
tests. A corps of six junior college students scored all tests 
during the summer of 1926. These people were given a brief 
period of appropriate training by one of the writers. 

Not all the results of this testing were utilized in the 
present investigation. Only those pertaining to the pupils of 
the ‘‘A’’ or upper divisions of grades four to eight inclusive 
were employed. The data of the ‘‘B’’ divisions were not used 
because the number of children in each of these divisions was 
relatively small. The results for the children below grade 
four were not used because a different intelligence test, the 
Otis Primary Examination, was employed in these grades. 
There was also some elimination of the results of the children 
in the ‘‘A’’ divisions of grades four to eight, because in this 
investigation it was possible to use only the results of those 
children whose mental ages were the same as the mental ages 
of the children who were located two grades higher or lower. 
In Table I is a statement of the total number of children in 
each grade, the number of children whose results were used 
and the number whose results could not be used. 
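
The bookkeeping of Table I can be checked for internal consistency: in each row the children used plus those eliminated at either end should equal the total, and the columns should add to the printed totals. The sketch below is a minimal check under one stated assumption: in the second 6A row (the comparison with 8A) the split of the 83 eliminated children into 82 at the lower end and 1 at the upper end is inferred from the printed column totals. All names are hypothetical.

```python
# (grade, total, used, eliminated_lower, eliminated_upper), from Table I.
ROWS = [
    ("4A", 225, 220, 3, 2),
    ("6A", 372, 199, 0, 173),   # 6A compared with 4A
    ("5A", 384, 354, 29, 1),
    ("7A", 154, 146, 0, 8),
    ("6A", 372, 289, 82, 1),    # 6A compared with 8A; split inferred from totals
    ("8A", 130, 107, 0, 23),
]
PRINTED_TOTALS = (1637, 1315, 114, 208)


def row_is_consistent(row):
    """True when used + eliminated (lower and upper) equals the row total."""
    _, total, used, lower, upper = row
    return total == used + lower + upper


def column_totals(rows):
    """Sum each numeric column across all rows."""
    numeric = [row[1:] for row in rows]
    return tuple(sum(column) for column in zip(*numeric))


if __name__ == "__main__":
    for row in ROWS:
        print(row[0], "consistent" if row_is_consistent(row) else "INCONSISTENT")
    print("column totals:", column_totals(ROWS))
```

Because 6A enters two comparisons, its 372 children are counted twice in the grand total.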

According to Table I there were 225 children in 4A and 372 
children in 6A. There were 220 of the 4A children who had 
the same mental age as 199 of the 6A children. In 4A there 
were three children with mental ages so low that they could 
not be matched with children of like mental ages in 6A. Two 
of the children in 4A were eliminated because they stood well 
above the remainder of the 4A distribution in mental age. In 
6A there were 173 children whose mental ages were higher 
than the mental ages of any children in 4A. The remainder 
of the table may be read in a similar manner. In 6A there 
were 117 children with mental ages which could be matched 
with like mental ages in both 4A and 8A. 

TABLE I
Total Number of Children in Each of the Grades from Four to Eight;
Also the Number of Children Whose Results Were Used in This
Investigation and the Number Whose Results
Were not Used

GRADES                        NUMBER OF         NUMBER            NUMBER
WHOSE          TOTAL          CHILDREN          ELIMINATED        ELIMINATED
CHILDREN       NUMBER         WHOSE RESULTS     AT LOWER          AT UPPER
WERE           OF CHILDREN    WERE USED         END OF            END OF
COMPARED                                        DISTRIBUTION      DISTRIBUTION

4A                225             220                 3                 2
6A                372             199                 0               173

5A                384             354                29                 1
7A                154             146                 0                 8

6A                372             289                82                 1
8A                130             107                 0                23

Total            1637            1315               114               208





The life ages of these children were obtained from the 
school records. The mental ages were determined from the 
results of Forms A and B of the Advanced Examination of 
the Otis Group Intelligence Scale, and the educational ages 
and subject ages were based on the results of Form B of the 
Stanford Achievement Scale. 

In 1926 Hibbing was a mining town of approximately 20,000 
inhabitants. The school district, covering 225 square 
miles, included about 7,000 school children. These pupils rep- 
resented more than 25 different nationalities as determined by 
the birthplace of their parents. Practically all of the chil- 
dren, however, were born in America and speak the English 
language with average fluency. 


TREATMENT OF DATA 


A statement of the treatment of the data will be made in 
connection with the test results of the fifth and seventh 
grades. In the fifth grade the extreme mental age difference, 
excepting a few cases, was over six years, from about nine 
years to over fifteen years. These mental ages, expressed in 
months, were grouped in steps of three. Thus the mental 
ages from nine years and two months to nine years and four 
months formed one group which was represented by a mental 
age of nine years and three months; the mental ages from 
nine years and five months to nine years and seven months 
formed another group which was taken at a mental age of nine 
years and six months and so on for the remaining mental ages. 
In this way twenty-eight groups were formed. 
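
The grouping in steps of three months can be sketched as below. This is only an illustration, assuming that mental ages are recorded in whole months and that each step is centered on a multiple of three months, which agrees with the two examples given in the text. The function names are hypothetical.

```python
def to_months(years, months):
    """Convert an age in years and months to months."""
    return 12 * years + months


def group_center(mental_age_months):
    """Center (in months) of the three-month step containing a mental age.

    Assumes whole months; steps are centered on multiples of three,
    so 110, 111, and 112 months all map to 111 months.
    """
    return 3 * ((mental_age_months + 1) // 3)


def format_age(total_months):
    """Render a number of months as 'Y yr M mo'."""
    years, months = divmod(total_months, 12)
    return f"{years} yr {months} mo"


if __name__ == "__main__":
    for years, months in [(9, 2), (9, 4), (9, 5), (9, 7)]:
        center = group_center(to_months(years, months))
        print(f"{years} yr {months} mo -> {format_age(center)}")
```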

For all of the children falling in the mental age group of 
nine years and three months, the average educational age was 
found. For all of the children falling in the mental age group 
of nine years and six months, the average educational age was 
computed, and so on for the remaining mental age groups. 
The average life and subject ages for each of the mental age 
groups were also computed. 

We now found the difference between the average educa- 
tional age of the nine years and three months mental age 
group and the average educational age of the eleven years and 
three months mental age group; a similar difference of the 
average educational ages was found for the nine years and six 
months and the eleven years and six months mental age 
groups, and so on for all of the pairs of groups differing two 
years in mental age for which there were corresponding men- 
tal age groups in the seventh grade. There were twenty such 
pairs. The unweighted average of these educational age dif- 
ferences was found to be 10.68 months, or 9.69 months when 
each difference was weighted by one-half of the number of 
children in the two mental age groups whose educational age 
difference was found. On the whole, there was little difference 
between the weighted and unweighted averages. Similar averages 
were found for life age and each of the subject ages provided 
for by the Stanford Achievement Scale. 

The data for grades four, six, seven and eight were treated 
in the same way as the treatment of the data of the fifth grade 
thus far described. This procedure yielded the averages of 
the educational age, life age and subject age differences for a 
two years’ difference in mental age for each of the five grades 
involved in the study. The five average educational age dif- 
ferences were then averaged to obtain a single result. Similar 
averages were computed for life age and each of the subject 
ages. The treatment thus far described has given us the aver- 
ages of the differences in life age, educational age and subject 
ages when mental age is the variable and grade location has 
been held constant. 

Further computations were made to find the averages of 
the differences in life age, educational age, and subject ages 
when the grade location differed two years and mental age was 
the constant. Having found for the fifth and seventh grades 
the average educational age for each mental age group, the 
differences between the averages of the educational ages of the 
fifth and seventh grade groups, with a common mental age, 
were found. There were twenty-six such differences. The 
unweighted average of these differences was found to be 3.77, 
or 4.61 when each difference was weighted by one-half the 
number of children in the two groups whose educational age 
difference was found. By similar computations the average 
of the differences in life age and the averages of the differ- 
ences in the subject ages between the children of the fifth and 
the children of the seventh grade, with a common mental age, 
were found. 
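
The two kinds of comparison described in this section, and the weighting of each difference by one-half the number of children in the two groups compared, can be sketched as follows. The figures in `fifth` and `seventh` are hypothetical and serve only to exercise the functions; the study's group averages are not reproduced here. The sketch also omits the restriction to mental age groups that have counterparts in the companion grade.

```python
def weighted_and_unweighted(pairs):
    """Average a list of (difference, n_first_group, n_second_group).

    Each difference is weighted by one-half the number of children
    in the two groups whose difference was found.
    """
    diffs = [d for d, _, _ in pairs]
    weights = [(n1 + n2) / 2 for _, n1, n2 in pairs]
    unweighted = sum(diffs) / len(diffs)
    weighted = sum(w * d for w, d in zip(weights, diffs)) / sum(weights)
    return unweighted, weighted


def same_grade_pairs(groups, gap_months=24):
    """Differences within one grade between groups two years apart in mental age.

    groups maps mental age in months to (average educational age, n).
    """
    pairs = []
    for ma in sorted(groups):
        if ma + gap_months in groups:
            low_mean, low_n = groups[ma]
            high_mean, high_n = groups[ma + gap_months]
            pairs.append((high_mean - low_mean, low_n, high_n))
    return pairs


def same_mental_age_pairs(lower_grade, upper_grade):
    """Differences between two grades for groups with a common mental age."""
    pairs = []
    for ma in sorted(set(lower_grade) & set(upper_grade)):
        low_mean, low_n = lower_grade[ma]
        up_mean, up_n = upper_grade[ma]
        pairs.append((up_mean - low_mean, low_n, up_n))
    return pairs


# Hypothetical data: mental age (months) -> (average educational age, n).
fifth = {111: (120.0, 4), 114: (122.0, 6), 135: (130.0, 10), 138: (133.0, 12)}
seventh = {135: (134.0, 2), 138: (138.0, 6)}

if __name__ == "__main__":
    print(weighted_and_unweighted(same_grade_pairs(fifth)))
    print(weighted_and_unweighted(same_mental_age_pairs(fifth, seventh)))
```

The first call holds grade constant and varies mental age; the second holds mental age constant and varies grade.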

To find the averages of the differences in life age, educa- 
tional age, and the subject ages, with mental age constant, for 
grades four and six and grades six and eight, the data of these 
grades were treated in the same manner as those for grades 
five and seven. The three averages of the educational age dif- 
ferences found for each pair of grades were then averaged to 
obtain a single expression of this difference. Similar averages 
were computed for life age and the subject ages. The results 
of these computations are expressed in Tables II, III, IV and 
V. 

A similar technique was used by M. J. Van Wagenen.¹ 
However, there are differences between this study and his. 
His study was limited to grades six and eight; the measure- 
ment in achievement was made in the subjects of spelling, 
reading, arithmetic, history, and geography, while this study 
included also the subjects of nature study and science, lan- 
guage, and literature; his results are not expressed in terms 
of educational and subject ages; and both the mental and edu- 
cational tests employed in the two studies were entirely dif- 
ferent. In spite of these differences the general results of the 
two studies are in close agreement. Moreover, Van Wagenen 
does not make any statement of the reliability of the differ- 
ences which he obtained. 


RESULTS 


In interpreting the results of this study the assumption is 
made that groups of children of the same mental age, if placed 
in equally favorable learning situations, have equal learning 
ability as determined by their average achievement in the sev- 
eral school subjects or by their educational age. 

All of the results are presented in Tables II, III, IV, and 
V. The first three of these tables are constructed on the same 
plan. The only difference is that they present the results of 
different pairs of grades. Table II gives the results for grades 
four and six; Table III for grades five and seven; and Table 
IV for grades six and eight. 

In the first column of each of these tables are given the 
initial letters of the different ages; the meanings of these let- 
ters are explained at the foot of each table. The second and 
third columns give, for each grade of the three pairs of 
grades, the averages of the differences in life age, educational 
age and the subject ages for the groups of children which differ 
two years in mental age, but are in the same grade. The 
values in the fourth column of each table are the averages of 
the values in the second and third columns. In the last column 
of each table the values are the averages of the differences 
in life age, educational age and the subject ages for the 
groups of children with the same mental age, but with a two-year 
difference in grade location. 

¹ Van Wagenen, M. J. ‘‘Grade Placement Versus Mental Age as a 
Factor in School Achievement.’’ Twenty-Seventh Yearbook of the National 
Society for the Study of Education, Chap. VI, 1928. 

The figures in the second column of Table V were found 
by averaging the figures in the second and third columns in 
Tables II, III, and IV for each age expressed in column one. 
Similar averages for the last columns in Tables II, III, and IV 
are given in the third column of Table V. The last half of 
Table V is a duplicate of the first half, with the exception of 
the fact that the averages in the first half are weighted as 
explained in the section on the treatment of the data, while 
the averages in the last half are unweighted. From an ex- 
amination of the table it will appear that the unweighted 
values are almost identical with the weighted values. 
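
The averaging that produces row two of Table V (total educational age) can be illustrated with the corresponding entries of Tables II, III, and IV. The sketch below is a check on the arithmetic only; the variable names are hypothetical.

```python
# Total educational age (row two) from columns two and three of
# Tables II, III, and IV: grade constant, mental age differing two years.
SAME_GRADE = [
    ("4A", 12.26), ("6A", 7.25),    # Table II
    ("5A", 9.69), ("7A", 11.48),    # Table III
    ("6A", 10.35), ("8A", 14.19),   # Table IV
]

# Total educational age (row two) from the last column of each table:
# mental age constant, grade location differing two years.
SAME_MENTAL_AGE = [
    ("4A and 6A", 6.33),   # Table II
    ("5A and 7A", 4.61),   # Table III
    ("6A and 8A", 8.07),   # Table IV
]


def average(entries):
    """Mean of the numeric part of a list of (label, value) pairs."""
    return sum(value for _, value in entries) / len(entries)


same_grade_average = average(SAME_GRADE)
same_mental_age_average = average(SAME_MENTAL_AGE)

if __name__ == "__main__":
    print(f"grade constant:      {same_grade_average:.2f}")
    print(f"mental age constant: {same_mental_age_average:.2f}")
```

To two decimals these agree with the values 10.87 and 6.34 cited for Table V, row two.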

Standard errors were found only for the averages of the 
total educational age differences and for the difference of 
these averages. These averages as given in Table V, row two, 
are 10.87 and 6.34. To find the standard error of the average 
of the educational age differences when mental age was held 
constant and the grade differed by two years, the values for 
all of the grades were thrown into a single distribution. 
Moreover, in finding this standard error N was taken as the 
number of groups, not the number of single individuals. The 
latter procedure would have reduced the size of the standard 
error considerably but it was not considered valid. The aver- 
age found by placing all of the values in a single distribution 
is 6.69, slightly larger than the corresponding values given in 
Table V. The standard error of this average is .95. 

A procedure similar to the one described in the above para- 
graph was used to find the average of the educational age dif- 
ferences when the grade was held constant and the mental 
age differed by two years. This average was found to be 10.89 
with a standard error of .73. 

The difference of the two averages, 10.89 and 6.69, is 4.20. 
The standard error of this difference is 1.2. The obtained difference 
is therefore more than 2.78 times the standard error 
of the difference, making our obtained difference statistically 
reliable. 

Even though there were no reliable difference between these 
averages, our results would still show that two years in mental 
age had as much influence upon educational achievement as 
two years of school training and life experience. 
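
The reliability figures quoted above can be reproduced with the usual formula for the standard error of a difference between two uncorrelated averages, the square root of the sum of the squared standard errors. That this formula was the one used is an assumption; the text does not state it, although the result agrees with the 1.2 reported. The names below are hypothetical.

```python
import math


def se_of_difference(se_a, se_b):
    """Standard error of a difference, assuming uncorrelated averages."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


mean_grade_constant = 10.89       # mental age differs two years
mean_mental_age_constant = 6.69   # grade location differs two years

difference = mean_grade_constant - mean_mental_age_constant
se_diff = se_of_difference(0.73, 0.95)
ratio = difference / se_diff

if __name__ == "__main__":
    print(f"difference = {difference:.2f}")
    print(f"S.E. diff. = {se_diff:.2f}")
    print(f"ratio      = {ratio:.2f}")
```

Under this assumption the ratio comes to about 3.5, which is consistent with the statement that the difference is more than 2.78 times its standard error.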


TABLE II
Averages of Life, Educational, and Subject Age Differences in Terms of
Months for 220 4A and 199 6A Pupils who Differ Two Years
in Mental Age but are in the Same Grade: Also Averages
of Life, Educational, and Subject Age
Differences Between the 4A and 6A Pupils
Who are of the Same Mental Age

Column headings:
(1) Age (a)
(2) Averages of the differences in life, educational, and subject ages
    for pupils of 4A who differ two years in mental age
(3) Averages of the differences in life, educational, and subject ages
    for pupils of 6A who differ two years in mental age
(4) Averages of the values in columns two and three
(5) Averages of the differences in life, educational, and subject ages
    between 4A and 6A pupils who have the same mental age

(1)                 (2)        (3)        (4)        (5)
1. L. A.           -6.42      -4.73      -5.58      32.93
2. T. E. A.        12.26       7.25       9.76       6.33
3. T. R. A.        15.94      13.20      14.57       2.91
4. T. A. A.        11.12       8.73       9.93      11.23
5. N. S. A.        17.07       8.92      13.00       4.87
6. H. L. A.        12.36      12.94      12.65       3.46
7. L. A.           15.67       8.49      12.08        .04
8. S. A.            9.23        .31       4.77      12.02

N                    220        199        419        419

(a) The age column reads as follows: (1) life age; (2) total educational
age; (3) total reading age; (4) total arithmetic age; (5) nature study-
science age; (6) history-literature age; (7) language age; (8) spelling
age.

TABLE III
Averages of Life, Educational, and Subject Age Differences in Terms of
Months for 354 5A and 146 7A Pupils who Differ Two Years in
Mental Age but are in the Same Grade: Also Averages of
Life, Educational, and Subject Age Differences Between
the 5A and 7A Pupils who are of the
Same Mental Age

Column headings:
(1) Age (a)
(2) Averages of the differences in life, educational, and subject ages
    for pupils of 5A who differ two years in mental age
(3) Averages of the differences in life, educational, and subject ages
    for pupils of 7A who differ two years in mental age
(4) Averages of the values in columns two and three
(5) Averages of the differences in life, educational, and subject ages
    between 5A and 7A pupils who have the same mental age

(1)                 (2)        (3)        (4)        (5)
1. L. A.            1.06      -1.56       -.25      29.39
2. T. E. A.         9.69      11.48      10.59       4.61
3. T. R. A.        13.36      15.83      14.50       2.71
4. T. A. A.         9.38      11.49      10.44      -3.16
5. N. S. A.         9.91      11.60      10.76       7.25
6. H. L. A.         8.75       9.29       9.02      12.65
7. L. A.           11.87       9.32      10.60       8.73
8. S. A.            6.29       9.50       7.90      10.46

N                    354        146        500        500

(a) The age column reads as follows: (1) life age; (2) total educational
age; (3) total reading age; (4) total arithmetic age; (5) nature study-
science age; (6) history-literature age; (7) language age; (8) spelling
age.

Perhaps the best values in all of these tables are to be found 
in the first half of Table V. The figures in row one show that 
the children of the same mental age who are in the higher 
grades of our grade pairs are approximately two and one-half 
years older in life age than the children who belong in the 
lower grades of our grade pairs. This value is quite consistent 
for the individual grade pairs as may be seen in the 
other tables. However, children of the same grade who stand 
two years higher in mental age are about four months younger 
in life age. The only exception to this is found in grade five 
in which the children who stand two years higher in mental 
age are about one month older. 

TABLE IV
Averages of Life, Educational, and Subject Age Differences in Terms of
Months for 289 6A and 107 8A Pupils who Differ Two Years in
Mental Age but are in the Same Grade: Also Averages of
Life, Educational, and Subject Age Differences Between
the 6A and 8A Pupils who are of the
Same Mental Age

Column headings:
(1) Age (a)
(2) Averages of the differences in life, educational, and subject ages
    for pupils of 6A who differ two years in mental age
(3) Averages of the differences in life, educational, and subject ages
    for pupils of 8A who differ two years in mental age
(4) Averages of the values in columns two and three
(5) Averages of the differences in life, educational, and subject ages
    between 6A and 8A pupils who have the same mental age

(1)                 (2)          (3)        (4)        (5)
1. L. A.        [illegible]     -5.72      -5.32      29.97
2. T. E. A.        10.35        14.19      12.27       8.07
3. T. R. A.        16.02        18.86      17.44       5.16
4. T. A. A.        12.38        17.78      15.06       4.90
5. N. S. A.        13.03        11.96      12.50      10.33
6. H. L. A.         9.41        13.33      11.37      13.14
7. L. A.           14.52        13.59      14.06       9.32
8. S. A.            5.66        12.91       9.29      15.31

N                    289          107        396        396

(a) The age column reads as follows: (1) life age; (2) total educational
age; (3) total reading age; (4) total arithmetic age; (5) nature study-
science age; (6) history-literature age; (7) language age; (8) spelling
age.

The two most significant numbers of Table V are those in 
row two. According to these values, if they are assumed to be 
true values, children of the same grade who excel in mental 
age by two years also excel in educational age by eleven 
months, but those children of the same mental age who excel 
two years in grade location excel in educational age by only 
six months. Thus the educational age difference is five months
higher when the children are in the same grade but differ two
years in mental age than when they differ two years in grade
location but are of the same mental age.

TABLE V
Averages of Life, Educational, and Subject Age Differences in Terms of
Months for 1315 4A to 8A Pupils who Differ Two Years in
Mental Age with Grade Constant; Also Averages of
Life, Educational, and Subject Age Differences Between
4A and 6A, 5A and 7A, and 6A and
8A Pupils with Mental Age Constant

                 Weighted Averages      Unweighted Averages
AGE a             (2)        (3)          (4)        (5)

1. L. A.        -3.72      30.76        -5.03      30.84
2. T. E. A.     10.87       6.34        11.00       6.36
3. T. R. A.     15.54       3.59        15.59       3.41
4. T. A. A.     11.81       4.32        10.73       3.17
5. N. S. A.     12.08       7.48        12.20       7.35
6. H. L. A.     11.01       9.75        11.63       9.38
7. L. A.        12.24       6.03        11.91       5.26
8. S. A.         7.32      12.60         7.66      13.40

N                1315       1315         1315       1315

Columns (2) and (4): averages of the differences in life, educational,
and subject ages for pupils of 4A to 8A who differ two years in mental
age. Columns (3) and (5): averages of the differences in life,
educational, and subject ages between 4A and 6A, 5A and 7A, and 6A
and 8A pupils with mental age constant.

a The age column reads as follows: (1) life age; (2) total educational
age; (3) total reading age; (4) total arithmetic age; (5) nature study-
science age; (6) history-literature age; (7) language age; (8) spelling
age.

Not only do the children of the same grade who are mentally 
older by two years surpass, by eleven months in educational 
age, the children with whom they are compared, but they do 
this in spite of the fact that they are four months younger. 
Moreover, the children of the higher grade not only fall one
year and one-half short of their expected superiority in educational
age over the children in the lower grade, but they do
so in spite of the fact that they are two and one-half years
older and have thus had the advantage in life experience and
probably in school training of one-half year more than they
should be expected to have.

Moreover, if spelling age, which correlates only slightly 
with intelligence, were eliminated from the total educational 
age, the educational age difference for grade constant with a 
two-year difference in mental age would exceed still more 
the educational age difference for mental age constant with 
a two-year difference in grade location. Spelling age is the 
only subject age which is raised more by two years’ addi- 
tional training than by a two-year excess in mental age. 

On the whole it seems that our educational offerings must 
be adapted to the child’s mental age or learning ability if the 
child is to profit very much by them. The results of experiments
appear to support this contention. For example, Gesell
and Thompson² found that one and one-half weeks of training
in stair climbing, begun at the age of 53 weeks, yielded as
high a degree of skill as five weeks of training begun at the
age of 46 weeks. The subjects of these experiments were identical
twins, who at the age of 46 weeks had equal locomotor
ability. The daily training period lasted ten minutes.

It appears that nature sets limits to the learning abilities 
of children, and that, when the demands of training exceed 
these limits, the children fail to make progress. Perhaps it 
is the prevalence of this condition in the public schools which 
has made it impossible to show a difference between so-called 
good teachers and poor teachers when their efficiency was 
measured by the results of achievement tests administered to 
the children. For example, in teaching a six-months-old child 
how to walk, we should expect a poor teacher to obtain about 
the same results as a good teacher. Also, a poor teacher
may be expected to obtain as excellent results as a good teacher
by devoting time to the development of abilities which the
child has already acquired.

2 Gesell, Arnold, and Thompson, Helen. ‘‘Learning and Control in
Identical Infant Twins.’’ Genetic Psychology Monographs, VI: No. 1,
1929.

This investigation has some shortcomings, but we believe 
them to be of minor importance. Not all of the factors which 
have an influence upon achievement are included in the study. 
However, the results of similar studies indicate that the two 
with by far the most influence have been included. If a num- 
ber of other factors had been included, their proper control 
would have been exceedingly difficult or altogether impossible. 
A better technique or method of treating the data might have 
been employed, but this method has the advantage of sim- 
plicity. A knowledge of difficult statistical methods is not 
necessary to comprehend the findings. Perhaps one of the 
most serious deficits is the nature of the sample. There should 
have been more cases and these should have been carefully 
selected. These things were not done because such data were 
taken as were used for other purposes and were then available 
for this study. With all of the possible shortcomings, the 
general results are in line with those of other investigations 
of a similar nature. 

How shall we explain the fact that children who are in the 
same grade but differ two years in mental age, show a much 
larger educational age difference than children who are of the 
same mental age but differ two years in grade location? 
Evidently there must be a closer relation between mental age
and educational age than between amount of training and 
educational age. The relation between mental age and edu- 
cational age may be explained in several ways. First, that 
to do both the mental and educational tests is primarily a 
matter of training; and, second, that to do both of them is 
primarily a matter of learning capacity, training playing a 
secondary role. Now, what do our results show? According 
to Table I, they show that out of 384 fifth grade children all 
but 29 had mental ages which were also represented in the 
seventh grade in spite of the fact that these children were
two and one-half years younger and had two years less training;
on the other hand, out of 154 seventh grade children all
but 8 had mental ages which were also represented in the fifth 
grade in spite of the fact that these children were two and one- 
half years older and had received two years more training. 
It appears from these facts that such differences in training 
and life experience as exist today can have very little influence 
upon differences in mental age or differences in performance 
on an intelligence test. It is on account of such facts that we 
feel obliged to explain differences in mental age primarily 
upon the basis of differences in learning capacity. 

Table V shows that when mental age is constant, the differ- 
ence in training is two years, and the difference in life experi- 
ence is two and one-half years, there is only a difference of 
six months in educational age in place of the expected twenty- 
four. On the basis of this fact it seems fair to assume that 
differences in educational age as they exist in our schools
today are primarily due to differences in learning capacity.
These two paragraphs appear to us to offer evidence of the
fact that the closer connection between mental age and educational
age than between mental age and the amount of training
is to be ascribed more to differences in native capacity than to
differences in the amount of training.

If we can accept the conclusion that differences in educational
age or achievement are due more to differences in learning
capacity than to differences in the amount of training,
then this fact should be recognized in certain types of eugenic
measures, in worth-while educational and vocational guidance
programs, and in an improved type of school organization.
As we all know, promotion in our schools and graduation from
our higher institutions of learning are far more dependent
upon the length of time served than upon the amount accomplished,
a practice which evidently grew out of the belief
that the amount of time spent in training was all important
and learning capacity of slight significance. We also know
that our schools are attempting to administer the same curriculum
to all of the children in heterogeneous groups, regardless
of whether they have the capacity to acquire it or to make
use of it subsequent to its acquisition.


SUMMARY STATEMENT 

The purpose of this investigation was to determine the
relative importance of intelligence and length of school training
as factors influencing educational achievement. Intelligence
was defined as that capacity measured by the Otis Group
Intelligence Scale. Length of training was represented by
grade location in school. Educational achievement was defined
as those abilities measured by the Stanford Achievement
Test.

The method of investigation made possible a comparison 
between the educational ages of children in the same grade 
who differ two years in mental age with the educational ages 
of children of the same mental age who differ two years in 
grade location. Apparently any differences in the educational 
ages of pupils who have had the same amount of school train- 
ing are primarily due to differences in mental age. Likewise 
any differences in the educational ages of pupils who possess 
the same amount of intelligence may apparently be attributed 
to differences in amount of training. Thus by comparing the 
educational age difference when the amount of training is held 
constant with the educational age difference when the mental 
age is held constant, a reasonable measure of the relative influence
of the two factors upon educational achievement is
obtainable.

Such comparisons were made in grades 4 to 8 inclusive. In 
determining the influence of mental age, the pupils of each 
grade were placed in several groups according to mental age. 
Each group included three months in mental age. For each 
of these groups the average educational age was computed. 
The difference in educational ages between every possible two- 
year space in mental age was determined. For example, the 
educational age of the nine years and three months old group
was compared with that of the eleven years and three months
old group; the educational age of the nine years and six 
months old group was compared with that of the eleven years 
and six months old group, and so on. The educational age 
differences thus found were averaged for each grade and the 
results expressed in Tables II, III and IV. These averages 
for the different grades were then averaged and the results 
expressed in Table V. As shown in Table V, children who 
differ two years in mental age but are in the same grade have 
an educational age difference of 10.87 months. 
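The grouping procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the data layout, the rule that a pupil falls into the three-month group beginning at the largest multiple of three months not exceeding his mental age, and the use of a simple unweighted mean over group pairs are all hypothetical choices, and the sample pupils are invented.

```python
from collections import defaultdict
from statistics import mean


def two_year_ea_difference(pupils, bin_months=3, gap_months=24):
    """Average educational-age difference for a two-year mental-age gap.

    pupils: iterable of (mental_age_months, educational_age_months)
    for the pupils of ONE grade.
    """
    groups = defaultdict(list)
    for mental_age, educational_age in pupils:
        # Assumed binning rule: three-month mental-age groups.
        groups[(mental_age // bin_months) * bin_months].append(educational_age)

    group_means = {start: mean(values) for start, values in groups.items()}

    # Compare every pair of groups lying exactly two years apart.
    differences = [
        group_means[start + gap_months] - group_means[start]
        for start in sorted(group_means)
        if start + gap_months in group_means
    ]
    return mean(differences) if differences else None


# Invented pupils of one grade: (mental age, educational age), in months.
sample = [(111, 118), (112, 120), (114, 121), (135, 129), (136, 131), (138, 133)]
print(two_year_ea_difference(sample))  # 11.5
```

Running the same function once per grade and averaging the results would correspond to the step from Tables II, III and IV to Table V.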

In determining the influence of the length of training, the 
educational ages of the same mental age groups in two differ- 
ent grades two years apart were compared. These compari- 
sons were made between grades four and six, five and seven, 
and six and eight. The educational age differences obtained 
in this manner were averaged for each pair of grades and the 
results expressed in Tables II, III and IV. These averages
for the different pairs of grades were then averaged and the 
results expressed in Table V. Here it may be seen that chil- 
dren who are of the same mental age but differ two years in 
grade location have an educational age difference of 6.34 
months. Therefore, children who are in the same grade but 
differ two years in mental age exceed by 4.53 months the chil- 
dren who have the same mental age but differ two years in the 
amount of training. 
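The comparison with mental age held constant, and the final subtraction, may be illustrated in the same way. The sketch below assumes that each grade has already been reduced to a mapping from mental-age group to mean educational age; the function name and the two small grade tables are hypothetical. Only the figures 10.87 and 6.34 are taken from Table V.

```python
from statistics import mean


def grade_pair_ea_difference(lower_grade_means, upper_grade_means):
    """Mean educational-age difference between two grades, taken over
    the mental-age groups that occur in both grades."""
    shared = sorted(set(lower_grade_means) & set(upper_grade_means))
    differences = [upper_grade_means[g] - lower_grade_means[g] for g in shared]
    return mean(differences) if differences else None


# Invented data: mental-age group (months) -> mean educational age (months).
grade_5 = {120: 122.0, 123: 124.0, 126: 125.0}
grade_7 = {123: 130.0, 126: 132.0, 129: 135.0}
print(grade_pair_ea_difference(grade_5, grade_7))  # 6.5

# Weighted averages for total educational age, as reported in Table V.
same_grade_two_years_mental_age = 10.87
same_mental_age_two_years_grade = 6.34
excess = round(same_grade_two_years_mental_age - same_mental_age_two_years_grade, 2)
print(excess)  # 4.53
```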

The results of the investigation show that educational 
achievement is determined more by intelligence than by length 
of school training. This is evidenced not only in terms of 
total educational age but also in terms of each subject mea- 
sured by the Stanford Achievement Test, except spelling. 

Tentative conclusions pointing to the superior influence of 
intelligence upon educational achievement should be utilized. 
It is probable that educational guidance, types of school 
organization which enable pupils to work to capacity, and the 
establishment of valid criteria for promotion, are important 
and imperative. 








INFLUENCE OF COLOR ON LEGIBILITY OF COPY 


F. C. SUMNER 


Howard University 


The purpose of the work here reported was threefold: 
firstly, to verify early experimental results on legibility of col- 
ored letters on colored backgrounds, the origin of which is 
somewhat shrouded in hearsay but which Luckiesh¹ traces as
far back as a quotation from Le Courrier du Livre, which was
printed in Scientific American Supplement, Feb. 2, 1913.
These results will for convenience' sake be referred to hereafter
as those of Luckiesh.

In the second place it is proposed to enlarge the number of 
color-combinations from 13 to 42. 

These forty-two combinations are obtained from seven 
colors, namely, Red, Yellow, Blue, Green, White, Black and 
Gray, by making every possible different combination of lettering
and background out of these colors. The original colors
here used were of medium chroma and tint equivalent to De
Voe’s standard showcard colors.
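The count of forty-two follows from taking every ordered pair of two different colors out of seven (7 × 6 = 42). A minimal Python sketch, in which the variable names are illustrative only:

```python
from itertools import permutations

colors = ["Red", "Yellow", "Blue", "Green", "White", "Black", "Gray"]

# Each ordered pair is (lettering, background); a color is never
# paired with itself.
combinations = list(permutations(colors, 2))
print(len(combinations))  # 42
```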

The lettering, consisting of digits and letters (6 units to a
cardboard), was stenciled with De Voe’s showcard colors.

Five subjects were used, one at a time and only on uni- 
formly clear days. Measurements were made of the maximum 
distance at which legibility of copy in its entirety was pos- 
sible. In this regard the procedure was not unlike that used 
in visual acuity tests save that the subject came forward until 
he could just read correctly the characters. The distance- 
records for the 42 combinations were ranked by subjects and 
average legibility rankings for various combinations were then 
ranked. The experiment was an outdoor one, owing to dis- 
tance requirements. 
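The ranking procedure may be sketched as follows. The sketch assumes that a greater legibility distance earns a better (lower) rank and that tied values share the mean of the positions they occupy, which would account for half-ranks such as 7.5 in Table I. The function name and the distances are hypothetical.

```python
def average_ranks(values, largest_first=True):
    """Rank values from 1 upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=largest_first)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run over all values tied with the one at pos.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Invented distances (arbitrary units) for four combinations, two subjects.
distances = {"S1": [60, 55, 40, 40], "S2": [58, 59, 35, 42]}

per_subject = [average_ranks(d) for d in distances.values()]
mean_rank = [sum(column) / len(column) for column in zip(*per_subject)]

# The averaged rankings are themselves ranked, the smallest mean first.
final_rank = average_ranks(mean_rank, largest_first=False)
print(final_rank)  # [1.5, 1.5, 4.0, 3.0]
```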

1 ‘‘Color and Its Application.’’

In the third place, it was proposed to ascertain the relationship
of legibility of color combinations and their affective
preference. For this latter phase of the experiment the same 
five subjects were later required to rank according to the order 
of preference the 42 color combinations. 


FINDINGS 


Table I presents the order of average rankings for the five 
subjects as to the legibility of the color combinations and also 
the respective rankings of the average rankings as to affective 
preference. 

CONCLUSIONS 


1. The law of the legibility of colored lettering on colored 
backgrounds discovered by the early investigators and re- 
sumed by Luckiesh and Poffenberger, namely, that legibility 
depends on brightness-difference between color of lettering 
and that of background, appears substantiated. 

2. A second law of legibility is discernible, namely, that 
dark colored lettering on a light colored background is more 
legible than the reverse in daylight. 

3. In the present investigation it is found that gray forms 
the best background for the legibility of colored lettering. 

4. Comparing the present results on legibility with those 
reported by Luckiesh for a considerably smaller number of 
color combinations, it is found that the correspondence is a 
fairly high positive one (amounting to .46) despite the fact 
that it is impossible to be sure of exact color standards in- 
volved in the investigation reported by Luckiesh. (See Table 
II.) 

5. Many more or less uncontrollable difficulties beset an investigation
of legibility conducted as in the present instance,
namely: (a) interference of negative after-images, especially
after a prolonged attempt to read at great distance; (b) the
phenomenon of irradiation; (c) the fact that one unit of the
six characters may prove unduly misleading and that such
illusions were dispelled by the subject with difficulty; (d)
variability in legibility of individual characters; (e) the adjustment
of the rest-interval between readings (a fixed interval,
while proving sufficient in some cases for the clearance of
negative after-images and adaptation- and accommodation-effects,
may not prove adequate in other cases where a lengthy
exposure is necessary for reading); (f) differences in legibility-distances
for some color-combinations are so slight that
ranking of them may show considerable variation for different
subjects (this bunching of legibility-distances is most pronounced
for those combinations which lie intermediate between
extremes in the final legibility-ranking; see Table I);
(g) mind-set or attitude of the subject undoubtedly exerted
some effect on results (a side-experiment in which three
subjects were tested together showed unmistakably the influence
of the competitive attitude on reading at great distances).

TABLE I

                                                      AFFECTIVE
CASE NO.   BACKGROUND   LETTERING   LEGIBILITY       PREFERENCE
                                       RANK             RANK

    1      Gray         Blue            1.0              1.0
    2      Gray         Black           2.0             19.0
    3      Yellow       Black           3.0              6.0
    4      Gray         Red             4.0              3.0
    5      Yellow       Gray            5.0             20.0
    6      Gray         Green           6.0              9.0
    7      Yellow       Blue            7.5              7.0
    8      White        Gray            7.5             23.0
    9      Gray         Yellow          9.0             14.0
   10      Yellow       Red            10.5             10.0
   11      White        Black          10.5             16.0
   12      Yellow       Green          12.0             18.0
   13      Red          Green          13.0             30.5
   14      Gray         White          14.0             11.5
   15      Green        Blue           15.0             24.0
   16      Green        Black          16.5             36.0
   17      Green        Gray           16.5             27.0
   18      White        Green          18.0             15.0
   19      Blue         Green          19.0             25.0
   20      Black        Green          21.0             37.0
   21      Red          Gray           21.0             26.0
   22      Blue         Yellow         21.0             17.0
   23      Green        Yellow         23.0             29.0
   24      White        Red            24.5              8.0
   25      White        Blue           24.5              2.0
   26      Red          Black          26.0             42.0
   27      Black        Yellow         27.0              5.0
   28      Black        Gray           28.0             32.0
   29      Blue         White          29.5             13.0
   30      Blue         Gray           29.5             11.5
   31      Red          Blue           31.0             33.0
   32      Red          Yellow         32.0             28.0
   33      Blue         Red            33.0             34.5
   34      Red          White          34.0             22.0
   35      Black        White          35.0              4.0
   36      Green        White          36.0             21.0
   37      Yellow       White          37.0             34.5
   38      Green        Red            38.0             30.5
   39      Black        Red            39.0             38.0
   40      Blue         Black          40.0             41.0
   41      White        Yellow         41.0             40.0
   42      Black        Blue           42.0             39.0

TABLE II

ORDER OF VALUE   PRINTED     BACK-      ORDER-INDEX     RANK ORDER
(LUCKIESH)       MATTER      GROUND     (PRESENT        (PRESENT
                                         STUDY)          STUDY)

 1 (greatest)    Black       Yellow        3.0             1.0
 2               Green       White        18.0             5.0
 3               Red         White        24.5             6.5
 4               Blue        White        24.5             6.5
 5               White       Blue         29.5             9.0
 6               Black       White        10.5             2.5
 7               Yellow      Black        27.0             8.0
 8               White       Red          34.0            10.0
 9               White       Green        36.0            12.0
10               White       Black        35.0            11.0
11               Red         Yellow       10.5             2.5
12               Green       Red          13.0             4.0
13               Red         Green        38.0            13.0

rho .46

6. Legibility and affective preference of color-combinations 
show a fairly high positive correlation (rho .54). 

7. Affective preference of the color-combinations obeys the 
law of brightness-difference more strikingly than does legi- 
bility. 

8. Affective preference of color-combinations depends more 
on brightness-difference than on legibility. 
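The rank correlations cited in conclusions 4 and 6 are presumably Spearman coefficients. The article does not state the exact computing formula, so the sketch below assumes the familiar rank-difference form, rho = 1 - 6Σd²/(n(n² - 1)), with no correction for ties. Applied to the two rank columns of Table II as printed, it gives a value of about .43, close to but not identical with the printed .46; the source of the small difference cannot be determined from the article.

```python
def spearman_rho(rank_x, rank_y):
    """Rank-difference correlation, uncorrected for ties (an assumption)."""
    n = len(rank_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Table II: order of value (Luckiesh) against rank order (present study).
luckiesh = list(range(1, 14))
present = [1.0, 5.0, 6.5, 6.5, 9.0, 2.5, 8.0, 10.0, 12.0, 11.0, 2.5, 4.0, 13.0]

rho = spearman_rho(luckiesh, present)
print(round(rho, 2))  # 0.43
```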








A STUDY OF THE RELATION OF BRIGHTNESS TO 
STANFORD-BINET TEST PERFORMANCE* 
RUTH E. PERKINS 


University of Chicago


The purpose of this investigation was to study the influence 
of experience in answering questions of the Stanford-Binet 
tests. Given two groups of children with the same mental 
age but different chronological ages, would questions be found 
which the older retarded children could answer and the 
younger accelerated children could not? Significant differ- 
ences of this sort would indicate that added experience rather 
than superior endowment is the factor largely operative in 
answering certain tests of the Stanford-Binet scale. 


METHOD 

The study was made on 221 Stanford-Binet records. These 
had been given and scored by experienced examiners, and 
were culled from several thousand tests taken from three 
sources: (1) The Institute for Juvenile Research, Chicago, 
Illinois, (2) The Elementary School of the University of 
Chicago, (3) The Skokie School, Winnetka, Illinois. All
mental age and IQ calculations of the tests used were re- 
checked by the writer. 

* This article is a part of the thesis submitted by the writer to the
graduate faculty of the University of Chicago in candidacy for the degree
of Master of Arts, December 1930. The study was made under the
direction of Dr. A. W. Brown, of the Institute for Juvenile Research,
Chicago, Illinois. Grateful acknowledgments are made to Dr. F. A.
Kingsbury, of the University of Chicago, for criticisms and suggestions,
Mr. H. O. Gillet, Principal of the University Elementary School of the
University of Chicago, and Miss Funk, Secretary of the Department of
Educational Counsel, Winnetka, Ill., for permission to use Stanford-Binet
records.

Studies from the Institute for Juvenile Research; Paul L. Schroeder,
M.D., Director, Series C, No. 172.

The records were selected so as to make two groups of children,
each homogeneous as to mental age: Group A, children
with mental ages of 8 years to 8 years, 6 months inclusive;
Group B, children with mental ages of 11 years to 11 years,
6 months inclusive. Chronological ages ranged from 17 years
to 5 years, 9 months in Group A, and from 17 years, 10 months
to 7 years, 8 months in Group B. Each of these two mental
age groups was divided into three sections on the basis of
brightness: IQ’s from 0 to 89 were classed as retarded; 90 to
110 IQ normal; 111 IQ and above, superior. In Group A
there were 51 retarded, 21 normal, and 30 superior children,
and in Group B there were 52 retarded, 37 normal, and 30
superior children.
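The classification by brightness can be sketched briefly. The sketch assumes the ordinary ratio definition of the IQ (mental age divided by chronological age, times 100), rounds to the nearest whole number, and ignores any upper limit that the scale may place on chronological age for older subjects. The function names and the example ages are hypothetical.

```python
def iq(mental_age_months, chronological_age_months):
    """Ratio IQ; any ceiling on chronological age is ignored here."""
    return 100 * mental_age_months / chronological_age_months


def brightness_class(iq_value):
    """Sections used in the study: 0-89, 90-110, 111 and above."""
    rounded = int(iq_value + 0.5)  # round half up to a whole number
    if rounded <= 89:
        return "retarded"
    if rounded <= 110:
        return "normal"
    return "superior"


# A mental age of 8 years, 3 months (99 months) at three chronological ages.
print(brightness_class(iq(99, 69)))   # superior
print(brightness_class(iq(99, 99)))   # normal
print(brightness_class(iq(99, 132)))  # retarded
```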

A tabulation was made showing for each child the tests
passed; and from this was computed the frequency with which
the different brightness groups passed each test. Since the
number of children in each group varied, results for each
separate test were expressed as percentages of each class
passing that test. The reliability of the differences between
the percentages of the retarded and superior children passing
a given test was found by the formula cited by Holzinger¹ for
the P. E. of a percentage, for the P. E. of the difference between
two percentages, and for the reliability of this difference.


RESULTS 


The results are shown in Tables I and II. Table I shows 
the number and percentages of all the children in each mental 
age group who passed each test. For instance, in Group A, 
year VI, Test I (points to parts of head), 48 children, or 94 
per cent of the total number of retarded children in this 
group, passed the test. Twenty-one children, or 100 per cent 
of the normal children of this group, passed; and 29, or 97 
per cent of the total number of superior children, passed. 


1 Holzinger. Statistical Methods for Students in Education, Ginn and
Co., 1928.

TABLE I
Number and Percentage of all Children Passing Tests

GROUP A—M.A. 8 YRS.—8 YRS. 6 MO.

                                  Retarded        Normal        Superior
                                  0-89 IQ       90-110 IQ     111-140 IQ
                                No.  Per cent  No.  Per cent  No.  Per cent
VI
 1. Point to parts of head       48     94      21    100      29     97
 2. Missing parts                51    100      20     95      28     93
 3. Counts pennies               51    100      21    100      30    100
 4. Comprehension                50     98      20     95      30    100
 5. Recognizes coins             51    100      21    100      30    100
 6. Repeats sentences            51    100      21    100      30    100
VII
 1. Fingers                      50     98      21    100      29     97
 2. Picture description          48     94      17     81      30    100
 3. Digits forward               43     84      15     72      28     93
 4. Bow Knot                     44     86      [illegible]    28     93
 5. Differences                  49     96      21    100      29     97
 6. Copies diamond               41     80      19     90      25     83
VIII
 1. Ball & Field                 29     58       4     19      15     50
 2. Counting backward            44     86      20     95      22     73
 3. Comprehension                38     74      16     77      28     93
 4. Similarities                 27     52      16     77      28     93
 5. Definitions                  39     76      19     90      24     80
 6. Vocabulary—20 words          23     45       7     36      13     43
IX
 1. Date                         33     64      12     54      11     36
 2. Weights                      24     47      12     57      17     56
 3. Makes change                 33     64      15     68       8     26
 4. Digits backward (4)          15     29       8     38      10     33
 5. Makes sentences              31     60      13     61      12     40
 6. Rhymes                       22     43      12     57      18     60
X
 1. Vocabulary—30 words           1      2       0      0       1      3
 2. Absurdities                  [illegible]     7     33      10     33
 3. Designs                      16     31       7     33       6     20
 4. Reading & Report             [illegible]     1      4       0      0
 5. Comprehension                 5     10      [illegible]     2      6
 6. 60 words                     14     27       8     38      [illegible] 60
XII
 1. Vocabulary—40 words
 2. Define abstract words
 3. Ball & Field                  2      4       1      4       1      3
 4. Dissected sentences
 5. Fables
 6. Digits backward               0      0       1      4       1      3
 7. Picture interpretation
 8. Similarities                  1      2       0      0       0      0
XIV
 1. Vocabulary—50 words
 2. Induction test
 3. Differences (Pres. & King)
 4. Problems of fact
 5. Arithmetic
 6. Clock
XVI
 2. Fables
 4. Enclosed boxes
 5. Digits backward (6)
 6. Code
XVIII
 2. Paper cutting test
 3. Digits forward
 5. Digits backward

Table II shows the differences in percentages of retarded
and superior children passing the tests, and the reliability of
these differences. There are ten tests in Group A and some
17 tests in Group B which are passed by a higher percentage
of retarded than of superior children of the same mental age.
In the case of Test IX-3 (makes change) Group A, which is
passed by a higher percentage of retarded than superior children,
the difference between the two percentages is 38, and
this difference divided by its probable error gives a quotient
of 5.29. This difference then is highly reliable.
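The quotient described for Test IX-3 can be approximated with the usual probable-error formulas. The sketch assumes that the P. E. of a proportion is 0.6745 times the square root of pq/n and that the P. E. of a difference between two independent proportions is the square root of the sum of the squared probable errors; whether these are precisely the forms given by Holzinger is an assumption. The function names are hypothetical; the counts (33 of 51 retarded, 8 of 30 superior) are those of Table I.

```python
from math import sqrt

PE_FACTOR = 0.6745  # probable error = 0.6745 x standard error


def pe_of_proportion(passed, n):
    """Probable error of the proportion passed / n."""
    p = passed / n
    return PE_FACTOR * sqrt(p * (1 - p) / n)


def difference_over_pe(passed_a, n_a, passed_b, n_b):
    """Difference between two proportions divided by its probable error."""
    difference = passed_a / n_a - passed_b / n_b
    pe_difference = sqrt(pe_of_proportion(passed_a, n_a) ** 2
                         + pe_of_proportion(passed_b, n_b) ** 2)
    return difference / pe_difference


# Test IX-3 (makes change), Group A: retarded against superior.
quotient = difference_over_pe(33, 51, 8, 30)
print(round(quotient, 1))  # 5.4
```

Under these assumptions the quotient comes to about 5.4, of the same order as the printed 5.29.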

TABLE I—Continued

GROUP B—M.A. 11 YRS.—11 YRS. 6 MO.

                                  Retarded        Normal        Superior
                                  0-89 IQ       90-110 IQ     111-140 IQ
                                No.  Per cent  No.  Per cent  No.  Per cent
VII
 1. Fingers
 2. Picture description
 3. Digits forward
 4. Bow Knot                     52    100      37    100      29     97
 5. Differences
 6. Copies diamond               52    100      36     98      30    100
VIII
 1. Ball & Field                 52    100      35     94      26     86
 2. Counting backward
 3. Comprehension                52    100      37    100      29     97
 4. Similarities                 52    100      37    100      30    100
 5. Definitions
 6. Vocabulary—20 words
IX
 1. Date                         51     98      35     94      25     83
 2. Weights                      49     94      33     89      24     80
 3. Makes change                 51     98      37    100      26     86
 4. Digits backward (4)          50     96      31     83      24     80
 5. Makes sentences              52    100      37    100      27     90
 6. Rhymes                       49     94      35     94      27     90
X
 1. Vocabulary—30 words          41     78      32     86      19     63
 2. Absurdities                  45     86      35     94      27     90
 3. Designs                      39     75      31     83      28     93
 4. Reading & Report             43     83      32     86      24     80
 5. Comprehension                47     90      31     83      24     80
 6. 60 words                     42     80      30     81      24     80
XII
 1. Vocabulary—40 words          10     19       9     24       7     23
 2. Define abstract words        22     42      12     32      11     36
 3. Ball & Field                 23     44      22     59      16     53
 4. Dissected sentences          18     34      18     48      21     70
 5. Fables                       31     59      20     54      22     73
 6. Digits backward              16     30      15     40      18     60
 7. Picture interpretation       37     71      24     64      14     46
 8. Similarities                 32     61      27     73      26     86
XIV
 1. Vocabulary—50 words           0      0       0      0       1      3
 2. Induction test               19     36      15     40       7     23
 3. Differences (Pres. & King)    6     11       2      5       0      0
 4. Problems of fact             20     38       5     13       3     10
 5. Arithmetic                    2      3       4     10       1      3
 6. Clock                        16     30       5     13      11     36
XVI
 2. Fables                        2      3       5     13       2      6
 4. Enclosed boxes                9     17       5     13       5     16
 5. Digits backward (6)           2      3       2      5       4     13
 6. Code                          2      3       0      0       0      0
XVIII
 2. Paper cutting test            3      5       1      2       3     10
 3. Digits forward                0      0       0      0       1      3
 5. Digits backward               0      0       0      0       1      3

To illustrate further, there are about 14 tests in Group A
and 9 tests in Group B which are passed by a higher percentage
of superior than retarded children of the same mental
age. One such test in Group A is VIII-4 (similarities) in
which the difference between percentages of superior and retarded
children passing the test is 41, and this difference
divided by its probable error equals 7.35. This, again, is a
very reliable difference.

When fewer than 5 per cent of the children, either retarded
or superior, passed a test the numbers were so small as to be
highly unreliable, and hence these tests were omitted from
Table II. The practical difficulty of securing a larger number
of cases which would meet the requirements of the present
study made it impossible to obtain greater statistical reliability
in all cases. The fact that several of the tests in
both groups A and B (IX-1, XII-7, VIII-4, XIV-3 and
others) are identical with those listed by Miss Merrill² as being
easiest for retarded or hardest for superior children indicates
that the present results are not due to chance.

Of the ten tests passed by a larger number of retarded than
superior children in Group A, there were three: IX-1, IX-3,
and X-6 in which the differences in percentages approach
statistical reliability. In Group B these same three tests are
again passed by a larger percentage of retarded children
although the differences are smaller and less reliable.

2 Maud A. Merrill, ‘‘On the Relation of Intelligence to Achievement
in the Case of Mentally Retarded Children,’’ Comparative Psychology
Monographs, 1923-25.

In Group B, two of the 17 tests which the retarded children 
pass more readily than do the superior children show a very 
high reliability. These are XIV-4 (problems of fact) and 
XII-7 (picture interpretation). Just why these tests should 
so stand out is rather puzzling, since, although the subject 


TABLE II
Differences in Percentages of Children in Retarded and Superior Group
who Passed Tests and Reliability of these Differences

GROUP A

Tests Passed by Larger Percentage of        Diff. in    Diff. /
Retarded Than of Superior Children          Per cent    P.E. diff.
IX-3    Makes Change                           38         5.29
IX-1    Date                                   28         3.74
X-6     60 words                               17         3.06
VIII-2  Counting backward                      13         2.02
X-3     Designs                                12         1.84
VI-2    Missing Parts                           7         1.75
VIII-1  Ball & Field                            8         1.08
X-5     Comprehension                                      .90
IX-5    Makes sentences                         2          .27
VIII-6  Vocabulary                              1          .13

Tests Passed by Larger Percentage of        Diff. in    Diff. /
Superior Than of Retarded Children          Per cent    P.E. diff.
VIII-4  Similarities                           41         7.35
X-2     Absurdities                            22         6.4
VIII-3  Comprehension                          19         3.65
VII-2   Picture Description                     6         2.52
IX-6    Rhymes                                 18         2.38
VII-3   Digits forward                          9         2.24
VII-4   Bow Knot                                7         1.55
IX-2    Weights                                10         1.31
VI-1    Points to hand, ear & eye               3          .93
IX-4    Digits backward                         6          .84
VIII-5  Definitions                             4          .63
VI-4    Comprehension                           2          .49
VII-5   Differences                             1          .36
VII-6   Copies diamond                          3          .50

TABLE II, Continued

GROUP B

Tests Passed by Larger Percentage of        Diff. in    Diff. /
Retarded Than of Superior Children          Per cent    P.E. diff.
XIV-4   Problems of fact                       28         5.23
XII-7   Picture interpretation                 25         3.36
VIII-1  Ball & Field                           14         3.21
[illegible]                                    15         3.17
IX-4    Digits backward                        16         3.07
IX-3    Makes change                           12         2.70
IX-5    Makes sentences                        10         2.64
IX-2    Weights                                14         2.61
X-1     Vocabulary (30 words)                  15         2.28
XIV-2   Induction test                         13         1.90
X-5     Comprehension                          10         1.78
VIII-3  Comprehension                           4         1.47
VII-4   Bow Knot                                3         1.21
IX-6    Rhymes                                  4          .93
XII-2   Define Abstract Words                   6          .80
X-4     Reading & report                        3          .50
XVI-4   Enclosed boxes                          1       [illegible]

Tests Passed by Larger Percentage of        Diff. in    Diff. /
Superior Than of Retarded Children          Per cent    P.E. diff.
XII-4   Dissected sentences                    36         5.06
XII-8   Similarities                           25         4.30
XII-6   Digits backward                        30         4.09
X-3     Designs                                18         3.54
XII-5   Fables                                 14         2.00
XII-3   Ball & Field                            9         1.17
XIV-6   Clock                                   6          .82
X-2     Absurdities                             4          .82
XII-1   Vocabulary (40 words)                   4          .63

matter deals with concrete facts within the experience of older
children, the reasoning ability involved is supposedly one of
the higher intellectual functions. It seems that although a
bright younger child may possess more highly developed
powers of abstract reasoning than does the older retarded
child, he is not necessarily able to cope with a problem involving
concrete reasoning with which he has not come into contact
in his daily life. Test XII-7 Picture Interpretation is
listed by Miss Merrill³ as easiest for retarded children, which
agrees with the present results, but she does not mention test
XIV-4 (Problems of Fact).

3 Maud A. Merrill. Op. cit.

There are four tests in Groups A and B which are passed 
by none of the bright children: X-4, XII-8, XIV-3, and 
XVI-6. It seems probable that the failure of the younger 
brighter children to pass X-4 is due in many cases to their
inability to read the passage involved. The same thing prob- 
ably holds true for this test in Group B, where there is still a 
slight tendency for the retarded children to excel the bright 
ones in this test. 

The only child in Group A who passed test XII-8 was a 
retarded child, IQ 65. In Group B the bright children 
passed this test more readily than did the retarded children, 
which appears to support the assumption that tests of abstract 
thinking are passed more readily by bright children. 

Since 6 retarded and 5 normal children passed test XIV-3, 
it seems logical to infer that there is little in the innate intelligence,
alone, of a child which will lead him to ferret out the three
differences between a president and a king which
are required of him in this instance. Performance on this 
test seems to be influenced very decidedly by the amount of 
schooling the child has had. The findings in this case are in 
agreement with Miss Merrill, who lists the test as hardest for 
superior children. 

Of the tests in both groups A and B which are passed by 
higher percentages of superior than retarded children, there 
are five which show very reliable differences. These are: 
Absurdities X-2, Similarities VIII-4 and XII-8, Dissected
Sentences XII-4, and Digits Backward XII-6. All of these
tests are of the type which require the so-called “higher
thought processes” for their solution.
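
The reliability ratios of Table II can be illustrated with a short sketch. This section does not state how the probable error of a difference was computed, so the code assumes the classical formula for two independent percentages: the probable error of each proportion is 0.6745 times the square root of pq/n, and the two are combined as the square root of the sum of their squares. The function names, the group sizes, and the percentages in the example are hypothetical and are not taken from the study.

```python
import math

PE_FACTOR = 0.6745  # one probable error equals 0.6745 standard errors


def pe_of_proportion(p, n):
    """Probable error of a proportion p observed among n cases."""
    return PE_FACTOR * math.sqrt(p * (1.0 - p) / n)


def diff_over_pe(p1, n1, p2, n2):
    """Difference between two independent proportions divided by the
    probable error of that difference (assumed formula, see lead-in).
    Undefined when both proportions are exactly 0 or 1."""
    pe_diff = math.sqrt(pe_of_proportion(p1, n1) ** 2
                        + pe_of_proportion(p2, n2) ** 2)
    return (p1 - p2) / pe_diff


# Hypothetical example: 60 per cent of 50 retarded children and
# 22 per cent of 50 superior children pass a given test.
ratio = diff_over_pe(0.60, 50, 0.22, 50)
print(f"Diff. in per cent = 38, Diff. / P.E. diff. = {ratio:.2f}")
```

Under this assumption the ratio grows with the size of the difference and with the number of cases, which is why the same difference in per cent can carry different reliabilities in Groups A and B.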


INTERPRETATION OF RESULTS 
When we consider the four outstanding tests, IX-1, IX-3,
XII-7, and XIV-4, which are passed by higher percentages
of retarded than superior children, we find that they are tests
demanding concrete rather than abstract thinking. The
younger, brighter child plays with new and interesting words, 
with ideas and abstractions which carry him out of the world 
of concrete reality. The older retarded child, on the other 
hand, seems to grasp best those concepts which are plainly 
outlined for him in real every-day life. Again, performances 
on these tests may be accounted for by the fact that the older 
child has had, because of longer life, more chance to pick up 
information here and there, and has been taught facts which 
are considered too advanced or difficult for the younger child, 
even though the latter may be mentally superior. The “social
status” of the family also influences the child’s “experience,”
although Terman⁴ brings out the fact that social status as
measured by the family’s economic status and neighborhood 
environment may be in contrast to the family’s intellectual 
standing. 

Cyril Burt⁵ in his investigation into the influence of social
status on the test performance of London school children concluded
that the tests easiest for children from “superior”
homes were: tests requiring linguistic facility, scholastic tests,
memory tests for repetition of sentences, and tests depending
on items of information imparted during early life in a cultured
home. Those tests easier for children from poorer
homes were: tests depending upon familiarity with money,
tests which are perceptual rather than conceptual, “practical
tests,” and those depending upon critical shrewdness. Of
the four tests which are passed by higher percentages of retarded
than superior children in the present study, IX-1, IX-3,
and XIV-4 seem to fall under Burt’s classification of tests
easiest for children of poor social status, but he lists XII-7
under tests easier for children from superior homes. It is of
interest to note that all but one of the tests (XII-7, Picture
Interpretation) which are listed by Miss Merrill as being
easier for retarded children are also listed by Burt as easier
for children of poor social status.

4 Terman, Intelligence of School Children. Houghton Mifflin Co. 1919.
5 Burt, Mental and Scholastic Tests. P. S. King and Son, Ltd., London, 1921.


CONCLUSIONS 

1. The present study agrees with the work of other investi- 
gators who maintain that factors other than native intelli- 
gence influence performance on the Stanford-Binet Tests. 

2. Tests IX-3 Makes Change, IX-1 Date, XII-7 Picture
Interpretation, and XIV-4 Problems of Fact, are definitely
“experience” tests. That is, the ability to pass these tests is
based on experience rather than on innate intelligence.

3. Tests VIII-3 Comprehension, VIII-4 and XII-8 Similarities,
X-2 Absurdities, X-3 Designs, XII-4 Dissected Sentences,
and XII-6 Digits Backward, are tests of brightness.
The ability to pass these tests depends to a large degree on 
superior intellectual endowment. 

4. It is helpful to the clinical psychologist and to others 
who make frequent use of the Stanford-Binet scale to know 
just which tests of the series are definitely “experience” tests,
and which ones are much less dependent on experience. 
Knowing this, it is possible, after testing a child suspected of 
being mentally retarded, to interpret his performance on such 
tests in terms of the probable influence of his past experience 
even though one may not be familiar with all of his individual 
history.


THE RELIABILITY OF THE GOODENOUGH INTELLIGENCE TEST
USED WITH SUB-NORMAL CHILDREN FOURTEEN YEARS OF AGE


EDNA WILLIS McELWEE 

Ungraded Classes, New York City 

When Dr. Florence L. Goodenough established the norms for her
intelligence test based on the measurement of children’s drawings of a
man, she examined children from four to twelve years of age.

In her instructions for scoring she says: “In finding the IQ’s of
retarded children who are more than thirteen years old, the chronological
age should be treated as thirteen only, and the IQ recorded as
‘... or below.’”

The purpose of this study was to find out whether her norms were 
equally reliable when the intelligence test was given to sub-normal chil- 
dren over twelve years of age, the comparison being made in terms of 
mental age rather than IQ. 

During the past school year the writer reexamined forty-five of the 
children in the ungraded classes of New York City who were fourteen 
years of age. The Goodenough Intelligence Test was given at the same 
time as the Stanford-Binet Examination. 

The drawings made by these fourteen-year-old sub-normal children 
were quite varied. There was almost an unbroken gradation from the 
primitive circle to the higher types. 


TABLE I
Comparison of the Mental Ages on the Goodenough Intelligence Test
and the Stanford-Binet Examination

MENTAL AGE             BINET    GOODENOUGH
4-0 to 4-11              0          3
5-0 to 5-11              0          6
6-0 to 6-11              5         11
7-0 to 7-11             15          7
8-0 to 8-11             18          7
9-0 to 9-11              5          5
10-0 to 10-11            2          4
11-0 to 11-11            0          2
Median Mental Age       8-0        7-3

The mental age range on the Stanford-Binet Examination is from
6-0 to 10-11; on the Goodenough Intelligence Test from 4-0 to 11-11, 
being three years greater on the latter. The median mental age on the 
Binet is 8-0 and on the Goodenough 7-3, being nine months lower on 
the latter. 


TABLE II
Variation in Mental Age on the Goodenough Intelligence Test and the
Stanford-Binet Examination

Number cases with lower mental age on Goodenough       26
Number cases with higher mental age on Goodenough      15
Number cases with same mental age on both tests         4
Total number months lower on Goodenough               463
Total number months higher on Goodenough              212
Average number of months lower on Goodenough          5.5





The actual number of cases where the mental age was lower on the 
Goodenough than on the Binet, where the mental age was higher on the 
Goodenough, and where the mental age was the same on both tests was 
listed separately. Only one-third of the cases made a higher score on 
the Goodenough test. By calculating the actual number of months dif- 
ference in mental age on the two tests, the average difference of 5.5 
months lower on the Goodenough was found. This is a much closer 
comparison than the median mental age difference of nine months shown 
in Table I.
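
The arithmetic behind these figures can be retraced in a few lines. The sketch below uses only the counts printed in the two tables; the helper `to_months` and the variable names are illustrative. The net average works out to about 5.6 months, which the table gives as 5.5; reading this as a dropped rather than rounded final digit is an assumption.

```python
def to_months(age):
    """Convert a mental age written as 'years-months' into months."""
    years, months = age.split("-")
    return 12 * int(years) + int(months)


# Counts and totals as printed in Table II.
lower, higher, same = 26, 15, 4
months_lower, months_higher = 463, 212

total_cases = lower + higher + same                 # all children tested
share_higher = higher / total_cases                 # fraction scoring higher
net_months_lower = months_lower - months_higher     # net deficit in months
average_lower = net_months_lower / total_cases      # average deficit per child

# Difference between the two medians reported in Table I.
median_gap = to_months("8-0") - to_months("7-3")

print(total_cases, round(share_higher, 3), round(average_lower, 2), median_gap)
```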

Using the Product Moment formula the correlation between the mental 
age on the Goodenough Intelligence Test and the Stanford-Binet Examination
was found to be .717 ± .048. This corresponds very closely
with the average correlation of .763 which Dr. Goodenough found be- 
tween the mental age on her test and the Binet for ages four to twelve 
taken separately. 
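
The figure attached to the coefficient is its probable error. The study does not say which formula was used; a minimal sketch, assuming the customary expression 0.6745(1 - r²)/√N with the forty-five cases reported above, gives a value close to the .048 quoted. The function name `pe_of_r` is illustrative.

```python
import math


def pe_of_r(r, n):
    """Probable error of a product-moment correlation r based on n cases,
    using the customary formula 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


pe = pe_of_r(0.717, 45)
print(f"r = .717, probable error = {pe:.4f}")
```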

CONCLUSION 


The results of this study show that the Goodenough Intelligence Test 
can be used just as satisfactorily with sub-normal children over twelve 
years of age as it can with younger children. 











A NOTE ON THE “MULTIPLE CHOICE” TEST


S. EDSON HAVEN AND HERMAN A. COPELAND 


Ohio State University 


Although it is commonly believed that objective measurement, or test- 
ing, in the field of education is of recent origin, it has been pointed out 
that in 1864 the Rev. George Fisher of the Greenwich Hospital School, 
England, was responsible for this item, “A book, called the ‘Scale-Book,’
has been established, which contains the numbers assigned to
each degree of proficiency in the various subjects of examination” (writing,
spelling, mathematics, etc.).¹ Such attempts must have been
sporadic and generally unreported.² Possibly as an outgrowth of the
success in group testing in the army, the movement began, and for that 
reason seems to be so young. On account of its immaturity, contributions 
to the technique are still to be expected. 

Textbooks often divide tests into recall (an example is the completion) 
and recognition (true-false, matching, multiple response, et al.). Argu- 
ments for and against each of these are easily found. The multiple re- 
sponse has enjoyed popularity, due, perhaps, to the comparative ease of 
construction and scoring, and to the slight compensation for chance which 
it provides. This last type has often been called the “multiple choice.”
It is the opinion of the writers that this is a misnomer; as, convention- 
ally, only one choice per test item is recorded and scored; hence it is a 
single choice test. Upon analysis of the items in the mid-term objective 
tests in current use the following is often found: To complete the thought 
of a sentence correctly, several (commonly four) alternatives are presented, 
one of which is best or correct. The directions given are to indicate the 
best alternative. In reality one is best (according to the instructor), 
another is good or very good, a third is fair, and the other(s) is (are) 
trivial. The supposedly best alternative may not be considered the best 
by all students, and so it would most likely be the second choice of sev- 
eral, providing the above description of the alternatives holds true. Un- 
less the objectivity is to be destroyed, we cannot allow credit for one or 
the other and at the same time keep the usual directions. 

1 E. L. Thorndike, Jour. of Educ. Psychol., 1913, 4, 551-552.
2 Rudolf Pintner, Educ. Psychology, New York, 1929. Footnote p. 350.

In order to observe the nature of these second choices, a preliminary
test for General Psychology was constructed of 55 true-false statements,
21 matching items, and 13 four-response problems. The first, second,
third, and fourth choices to each problem were recorded by the students.
Each choice was scored, and for that reason was considered as a particular 
item. Then tetrachoric correlations were computed between the criterion, 
i.e., the score on the true-false and matching parts, and each of these 
items (52 correlations in all). Those items whose probability was at 
least nine in ten chances of being positive upon repetition were consid- 
ered significant. Table 1 shows the percentages of significant items ac- 
cording to choices. As the first and second choices seem to be the most 


TABLE 1
Per cent of Significant Tetrachoric Correlations

                           TEST 1    TEST 2
Criterion and choice 1       31        69
Criterion and choice 2       38        44
Criterion and choice 3                 44
Criterion and choice 4                 56


important, correlations between the total score on each of these and the 
criterion were computed. These coefficients are shown in Table 2, and are 
to be considered as being merely suggestive. 


TABLE 2
Correlations Between Criterion and Total Correct

                           TEST 1           TEST 2
Criterion and choice 1     0.47 ± .068      0.49 ± .069
Criterion and choice 2     0.22 ± .083      0.26 ± .085
Criterion and choice 3     not computed     0.23 ± .086
Criterion and choice 4     not computed     0.12 ± .090


A second test was constructed of 121 true-false statements and 16 four-response
problems. Significant choices were calculated by tetrachoric
correlations as previously done, and the percentages are shown in Table 
1. As all seem to be worthy of consideration, correlations between the 
criterion, the 121 true-false statements, and the total score on each choice 
were computed and are shown in Table 2. This indicates that the first 
choice or best alternative is the most important. It is the opinion of the 
writers that much depends upon the wording of the test items as to the 
discrimination value of the several choices. Then, the decision that an 
alternative is to be evaluated as a particular, e.g., third or fourth, choice
cannot always be definitely established and may affect the correlations.

To test the assumption that a second choice may be worthy of consideration,
we proceeded to calculate a correlation between the criterion
and the total of the first choices correctly indicated plus the total of the
“best” alternatives indicated in the second position. The coefficient
equalled 0.58 ± .060, which we believe to be a significant increase in
prognostic value over the 0.49 of the first choice.

To study empirically the effect of weighting the first choice, first it was 
weighted two and added to the second, resulting in a coefficient of correlation
0.57 ± .061; then it was weighted three and added to the second
choice, resulting in a coefficient of 0.51 ± .067. Evidently weighting in
this particular instance is not worth the effort involved. 
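
The weighting procedure just described can be sketched as follows. The scores are hypothetical and stand in for the students' records, which are not given here; `first_choice` is taken to mean the number of problems on which the best alternative was marked first, and `second_choice` the number on which it was marked second. The function names are illustrative, and the product-moment formula is written out in full.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def composite(first, second, weight=1):
    """Weight the first-choice score and add the second-choice score."""
    return [weight * f + s for f, s in zip(first, second)]


# Hypothetical scores for six students (not data from the study).
criterion = [80, 95, 60, 110, 72, 101]
first_choice = [8, 10, 6, 12, 9, 11]
second_choice = [3, 4, 2, 3, 1, 4]

results = {}
for weight in (1, 2, 3):
    scores = composite(first_choice, second_choice, weight)
    results[weight] = pearson_r(criterion, scores)
    print(f"first choice weighted {weight}: r = {results[weight]:.2f}")
```

With real data each weight yields one coefficient, and the coefficients are then compared as in the paragraph above.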

In conclusion it is our belief that the indication of more than one 
choice on a multiple response test of a few carefully selected items is of 
more value for prognostication than a test of many items in which only 
the first choice is recorded. How these shall be scored must be determined 
by statistical evaluation. 











NOTES AND NEWS 


To perpetuate the peace ideals of the late David Starr Jordan, for
many years President of Stanford University, the “World Unity Memorial
To David Starr Jordan” has been established by a number of his
friends and associates in this country and abroad. The Memorial,
sponsored by Mrs. Jordan, is being promoted by an international committee
composed of Hamilton Holt, Jane Addams, Manley O. Hudson
and Salmon O. Levinson representing the United States; Sir Norman
Angell representing England; Hans Wehberg, Germany; Joseph Redlich,
Austria, and Baron Y. Sakatani, Japan. “The purpose of this
Memorial is to make possible the wider diffusion of Dr. Jordan’s important
statements on peace and international cooperation by magazine and
pamphlet publication, and to encourage the rise of the peace spirit
among the new generation of college students.” Memorial Headquarters,
4 East 12th Street, New York City.


The Sixth World Conference of the New Education Fellowship will 
be held in Nice, France, from July 29 to August 12, 1932. Education 
and Changing Society is to be the theme of the conference. Some of 
the most distinguished educators and publicists in the world will appear 
on the program and delegates are expected from many countries. 
Frances Fenton Park, 425 West 123rd Street, New York City, is Secre- 
tary. 


The Biographical Directory of Leaders in Education, edited by J. 
McKeen Cattell, has taken its place among the most important works 
of reference. It contains biographies of about 11,000 of those in 
America who have done the most to advance education, whether by 
teaching, writing, research or administration, a careful selection from 
the million educational workers of the United States and Canada. They 
are those to whom daily reference is made in the press, from whom all 
positions of importance are filled. It should be a book essential to all 
who have relations with those engaged in educational work and neces- 
sary to every reference library. It is published by the Science Press, 
Grand Central Terminal, New York City, and Lancaster, Pa. 


Stanford University announces that Dr. Kurt Lewin of the Univer- 
sity of Berlin will be acting professor of psychology during the summer 
quarter of 1932 and for the two following quarters at that institution.


More than 50,000 scholarships and fellowships are available annually
in the United States, with a total money value of approximately 
$10,000,000, according to a bulletin of the Federal Office of Education. 
These are offered by 402 colleges and universities. Twenty-two states 
now furnish, by legislative enactment to institutions in the state, some 
sort of scholarship aid. “The donation of funds for scholarship purposes
is a form of philanthropy which has gained impetus since the
World War when a greater desire for college and university training
was evidenced,” states Miss Ella B. Ratcliffe, chief educational assistant
in the division of colleges and professional schools. This information 
should be of vital interest to many thousands of students who need finan- 
cial assistance to enable them to complete their education. 


The British Government has extended an invitation to the United 
States Government to participate in the 12th International Congress on 
Commercial Education to be held in London during the last week of 
July, 1932. The purpose of the congress is to bring together leaders in 
secondary and higher education for business from various countries to 
exchange ideas about outstanding problems and practices in business 
education. Due to the rapidly changing social and economic conditions 
throughout the world, the congress proposes to emphasize the newer 
social and institutional factors affecting business education. 


The National Probation Association, made up of judges, probation
officers, psychiatrists and others interested in the scientific treatment of
crime, announces the publication of its 1931 Year Book. This contains
the latest information on probation, juvenile courts and crime prevention
and should be a valuable source book for those in court professions
and other types of social work. The contributors to this publication,
some of the ablest authorities in these fields, urge the extension of child
guidance clinics in school systems for the purpose of discovering and
treating behavior problems before antisocial habits are deeply rooted.
Clinics for mental and physical examination of court cases are also advocated.
Those interested in the Year Book may place their order with
the National Probation Association, 450 Seventh Avenue, New York
City. The price is $1.00, paper bound; $1.50, cloth and board.


Several colleges and some of the major engineering and industrial 
organizations of the country are represented in an experiment in voca- 
tional or collegiate guidance for boys which, according to a recent 
announcement by President Harvey N. Davis of Stevens Institute of 
Technology, will be made at the Stevens Engineering Camp in northern 
New Jersey for two weeks next summer, August 13 to 27. Boys now in
high school and preparatory schools, preferably those who will enter college
in 1933, are to attend the camp. “One of the more important
decisions of a boy’s life,” says President Davis, “is his choice of a
college, not so much his choice of a particular college as the type of
college for which he is best fitted and which may lead him to a career
where he will find satisfaction.” Several of the lectures to be given
will touch on interests other than the professional interests of engineer- 
ing. Among the speakers will be the following: Dr. Walter V. Bing- 
ham, Director of Personnel Research Federation and Lecturer at Stevens 
Institute of Technology; Clarence F. Hirschfeld, Director of Research, 
Detroit Edison Company; Professor Johnson O’Connor, of Stevens 
Institute of Technology; and President Harvey N. Davis. 


Dr. Goodwin Watson, Columbia University, now on a year’s leave of 
absence in Europe, will have charge of a psychology study group this 
summer under the auspices of the new American Peoples College in 
Europe and the Pocono study tours. The group, to be composed of men 
and women between the ages of 18 and 30, will sail June 11th on the 
S. S. Homeric. The three months’ trip, to include Oetz, Vienna, France, 
Switzerland, Germany, Finland, Denmark and England, will cost about 
$390. Dr. Soren A. Mathiasen, one of the sponsors of the movement, 
may be reached at 55 West 42nd Street, New York City. 








BOOK REVIEWS 


BURTT, HAROLD ERNEST. Legal Psychology. Prentice-Hall, Inc., New
York. 1931. 467 pp. $6.00. 

The task of reviewing Dr. Burtt’s book has been a pleasant excursion 
into a field that is new to the reviewer. The task was essayed with some 
hesitancy because of the feeling that the work probably was very tech- 
nical. It might be well to state at the outset, therefore, that the author 
of Legal Psychology has been successful in making it understandable to 
those not trained in the field of psychology. Members of the legal pro- 
fession, police officials, social workers, and others coming into contact 
with crime and its detection and prevention will find in one book for 
the first time a compilation of all the psychological material of value in 
this field. It represents also a keen analysis and interesting presenta- 
tion of many of the problems of criminology. 

Dr. Burtt points out in his introduction that psychological facts and 
principles are not new to the legal profession. Gradually through the 
centuries empirical techniques have been built up by lawgivers, judges, 
juries, legislators and lawyers having underlying assumptions as to how 
people will act in certain situations. These assumptions form a sort of 
common-sense psychology which has been very necessary and useful to 
legal practitioners. The law, however, has been ready to adopt scientific 
principles as developed in various branches of knowledge, and now that 
psychology has reached an advanced point of development as a science, 
it seems likely that some of its principles will be taken over. 

It was with the aim of facilitating this absorption by the legal pro- 
fession of the facts and principles of the science of psychology that the 
present book was written. That there is much that psychology can give 
to the legal profession is shown over and over again by the fact that 
most of the techniques discussed in the book have had as yet very little 
practical application. 

The material has been organized under three headings, the psychology 
of testimony, the psychology of the criminal and the psychology of crime 
prevention. The psychology of testimony is considered in Chapters II 
to VII, inclusive. Dr. Burtt’s approach is that of the psychologist, the 
various psychological categories rather than the different types of evi- 
dence determining the organization of his material. A consideration of 
the psychology of sensation and perception logically is given first place, 
since errors made by a witness at the time of his original observation
preclude the giving of accurate testimony. The psychology of suggestion
and its bearing on testimony next is fully treated. Considerable
attention is given by the author to the form of the question presented 
by attorneys both in direct and in cross examination. The various 
methods of obtaining testimony are compared and experiments cited to 
show their relative reliability. Chapter VII, the last chapter dealing 
with the psychology of testimony, takes up the Problems of the Jury 
and the Judge. 

Beginning with Chapter VIII, the author turns his attention to the 
second division of his subject, namely, the psychology of the criminal. 
The question of confessions is first taken up, and the various methods 
pursued in obtaining them are evaluated. One of the more scientific of 
these methods, namely, the “association reaction method” of detecting
crime, is taken up in Chapter IX, while “Breathing and Crime Detection”
forms the subject of Chapter X. The techniques discussed in this
chapter and Chapter XI attempt to catch somewhat the same emotional 
factors as those considered in Chapter IX, but the method is somewhat 
different. In Chapters XII and XIII, the author turns from a consid- 
eration of the psychology bearing directly on the accuracy of testimony 
and on the obtaining of the truth from the criminal to a consideration 
of the characteristics of the criminal himself. 

In the third division of the book, which covers the psychology of crime 
prevention, the author considers in turn in Chapters XIV to XX the subjects
of Predelinquency, Eugenics, Punishment, Drugs, Suggestion and
Imitation, Education and Crime Prevention, and Trade-Mark Infringe- 
ment. 

The book closes with a chapter summarizing the main facts brought 
out and emphasizes those points which have received some measure of 
recognition in legal procedure. Another very valuable feature of the 
book is the concise summary to be found at the end of each chapter. 

OSCAR S. NELSON,
University of Pennsylvania. 


GARRY CLEVELAND MYERS. Building Personality in Children. Greenberg,
New York, 1931. Pp. XV + 360.

To the critical young psychologist Dr. Myers’ book may seem simple, 
popular and “common sensy” almost to the point of superficiality.
But to the perplexed young parent it will come as a genuine source of 
guidance. Written not for the college classroom but for the parent and 
teacher dealing directly with the child, it presents a great variety of 
concrete situations with their possible solutions. Teaching Children 
Care of Clothes, Psychology of Child Posture, The Child Who Doesn’t 
Talk Enough, The Proper Use of Money, are a few topics selected at 
random.

When the applied psychologist limits himself too rigidly to experiment
and investigation his science all too often fails to be applicable to much
of anything. And the too hasty or too strenuous attempt to apply more
or less pure science to practical situations is often ludicrously far-fetched.
In his book Dr. Myers makes no attempt to give the reader
the impression that he is being especially “scientific” nor that he is
“applying” science. But perhaps for just this reason his book, simple,
direct, and sensible as it is, represents applied psychology at its
best.

A. C. ANDERSON, 
Ohio University. 


ADLER, ALFRED. Problems of Neurosis. Cosmopolitan Book Corporation.
New York, 1930.


The Problems of Neurosis, with a Prefatory Essay by F. G. Crookshank,
is a further contribution of the Adlerian type to the field of
Individual Psychology. Dr. Crookshank, a pupil of Adler, in characterizing
Individual Psychology has the following to say of its content and
value: “The science of Individual Psychology teaches us that the leitmotif
of all neurosis and conflict is a sense of discouragement and inferiority.
But the keynote of the practice of Individual Psychology is
that of ‘benevolent comradeship’ which Adler tells us should characterize
the attitude of the physician towards his patient.” The Preface
writer, after paying his respects in none too flattering phrases to what
he calls “necrological” and “veterinary” schools of medicine, proceeds
to argue for the newer school of “psychological” medicine as
epitomized by Adler and his followers. The rest of his Essay is the
analysis of the “new principle” of Adler, organic-inferiorities and
the doctrine of the development of neurosis and psychoneurosis in
organic-inferiorities. He further claims that not only does Adler, in
this book, treat of the doctrines of organic inferiority, of compensation,
and of the life goal, which constitutes Individual Psychology, but he
also shows the individual-in-the-community relationship which is a different
thing from the individual himself.

“The problem of every neurosis is, for the patient, the difficult maintenance
of style of acting, thinking, and perceiving which distorts and
denies the demands of reality,” states Adler in Chapter 1 of his book.
Adler further states that “an individual goal of superiority is the determining
factor in every neurosis, but the goal itself always originates
in, and is strictly conditioned by, the actual experience of inferiority.”
It is, therefore, the physician’s problem to discover the causes of the
feelings of inferiority. Since the feelings of inferiority are regarded
as a sign of weakness the patient has a strong tendency to conceal them.
In order to bring them out, the physician must establish rapport with
his patient. Contrary to the Freudian concept that neuroses originate
in the sex libido, Adler conceives them as resulting when the patient
fails to adjust to the three real problems of life, namely, to society, to
occupation and to love. The expressions of failure of adjustment to
these three problems are protean. They may be in anxiety, melancholy,
feelings of guilt, masculine protest, inferiority, etc. Adler claims, “the
best way to understand a neurotic patient is to set aside all his neurotic
symptoms, and to study his style of life and his individual goal of superiority.”
(Italics mine.) “It is fear of defeat, real or imaginary,
which occasions the outbreak of the so-called neurotic.”

The mother’s early influence is a potent force in the child’s life while
its first feelings of inadequacy are being formed. While the child is in
this helpless formative stage she has the best chance to translate
society for him; and in doing so she may give him proper or improper
“social feeling.” In fact, the prototype of the child’s life is formed
in the first four or five years of life. Thus the child’s attitudes
formed toward the “major problems” of life furnish the basis of his
later “style of acting.”

The first child, second child and youngest child all are born into
different psychical attitudes of parents and into different economic
states—hence each patient is an individual psychological problem. Case
upon case of “marriage failure,” “love affairs,” “divorces,” “quitters
in life” are briefly analyzed for the reader. The evidence from these
cases is marshaled to support the author’s doctrines of inferiority and
the style of life with all their implications.

Adler’s technique of encouraging his patients, once he has made them
realize their difficulties, and of going straight at problems must
bespeak for him marvelous personal power in “analysis.” While many of
his postulates smack not of science but of philosophy, certainly
something may be said for his helpfulness to his patients. One sees
reflected in the Adlerian conception of the genetic growth of
personality a greater influence of Janet’s teachings than of Freud’s,
although Adler’s beginning was identified with the psychoanalysis of
Freud. However much one may differ from some of the tenets held by the
Freudians, the Adlerians and the Jungians, one is compelled to recognize
that the psychoanalytic school, with all of its branches, has made solid
contributions to the study of child psychology and to personality
development. The Problems of Neurosis is an interesting book and full of
case material.


JAMES R. PATRICK, 
Ohio University. 


FRYER, DOUGLAS. Measurement of Interests. Henry Holt and Company,
1931. 488 p. + xxxvi.


Fryer’s book is a very excellent summary of the work on measurement of
interests, both subjective and objective. The first nine chapters
present an historical sketch of the work done. The presentation is
really more than a sketch. Usable descriptions of techniques, item
selection, methods of statistical evaluation of items, procedures for
investigating the validities, easier scoring methods, etc., of the
interest scales and inventories are given. Throughout the book case
histories are given which serve well to illustrate various sorts of
interests and changes of interests. At the end of each chapter, we find
a bibliography with the more important references starred.

In Chapter X, Fryer presents an “acceptance-rejection” theory of
interests. “In the inventorying of likes and dislikes, the acceptance
and rejection of objects of stimulation is definitely what this
measurement aims to measure. In the information test, applied to the
measurement of interests, there is an acceptance or rejection in a
definite field of information. In the free association test an
acceptance and rejection of stimulation is indicated in the response
which is scored in the same manner as the subjected inventory, for
acceptance, by similarity to group interests, and for rejection, by lack
of similarity. While motivation would seem to influence these measures,
its influence would be the same as that upon a measure of ability. It is
extraneous to the measurement.”

In fact, Chapter X seems to be the most outstanding chapter in the
entire book, presenting in a very concise and readable style a critical
summary of the work on measurement of interests.

It would seem that the clinical use of interest measurements is for
adjustment of the individual, rather than a particular prognosis, and
that, as far as vocational guidance is concerned, measures of interest
are more usable in placement than in guidance as such.

HAROLD A. EDGERTON, 
Ohio State University. 


ALENE RALSTON AND CATHERINE J. GAGE. Present Day Psychology: an 
Objective Study in Educational Psychology. Chicago: J. B. Lip- 
pincott Co., 1931. xiv, 404. 


In presenting this book, the authors, according to their statement, are
attempting “to put the fundamental facts of educational psychology in a
language simple enough for the uninitiated to read and understand.” They
have succeeded but in so doing have created an impression of simplicity
and certainty which does not exist in psychology. The authors make no
claim for originality. That is no sin. They should, however, have been
more critical in their “gleanings,” and more careful in their
interpretation. It is quite evident that they are not experimentalists
because no experimentalist in stating so many “fundamental facts” would
make so few qualifications. A great amount of material from many authors
is presented, but not all of it is the result of careful
experimentation. Speculation has at least sired some of it.

The book is divided into three parts: Living, Learning and Measur- 
ing. In parts one and two the authors draw rather heavily from Thorn- 
dike’s Educational Psychology. Thorndike’s discussion of Original 
Nature and how to modify it is the groundwork for part one. Many
of his terms and classifications are used. Satisfiers and annoyers are
called upon quite frequently. Original and acquired bonds are spoken
of with certainty and great intimacy. Thorndike’s classification of in- 
stincts and his three levels of original nature, i.e., reflex, instinct and 
capacity, are used. Watson’s work on conditioning and reconditioning 
is cited as an illustration of how original nature may be modified. In 
addition, feelings, attitudes, emotions, intelligence and habit as related 
primarily to the child and adolescent are all very simply discussed in 
part one. 

Part two begins with a discussion and a complete acceptance of 
Thorndike’s primary and secondary laws of learning. Following this 
and in this order are chapters on improvement, permanence, transfer of 
training, fatigue and individual differences in which the usual findings 
and conclusions are stated. The next chapter, however, which is devoted 
to a simple and brief discussion of five schools of psychology represent- 
ing the modern trend, is unusual in that it is found in the section on 
learning in an educational psychology text. The five schools mentioned 
are behaviorism, psychoanalysis, dynamic psychology, purposivism and 
gestalt psychology. The last chapter in part two is unusual also in that 
it is a sort of glossary coming in the middle of the book. 

The third part of the book is devoted to the definition, description,
history and uses of the many kinds of tests. “Here an endeavor has
been made to create a tolerant attitude toward testing rather than to
give instruction in the technique of testing.” The last chapter is on
statistical treatment.

This book is easy reading and is rather comprehensive as far as the
number of the topics discussed is concerned. It might be recommended to
the parent or to any one who has no further ambition or need than to be
simply “initiated.” It might even be recommended to the secondary school
teacher who must fulfill a psychology requirement for teaching. This
type of psychology can be applied in the schoolroom, and the less doubt
there is in the mind of the teacher, the more successfully it can be
applied. This text should be recommended no further. It should be
avoided as so much poison, particularly by the student who plans to take
more than one course in psychology and in the later courses would have
to “unlearn” many of the things which he learned here.

T. C. SCOTT,
Ohio University.